Pivot to SparseDataFrame: TypeError: ufunc 'isnan' not supported in sparse matrix conversion #11633

DSLituiev · 2015-11-18T01:10:01Z

I want to convert a DataFrame to a SparseDataFrame before pivoting it, because the pivoted table gets really sparse (see also this discussion). I have a textual key ("chr") that I need to keep:

import pandas as pd

df = pd.DataFrame(
    list(zip([3, 2, 4, 1, 5, 3, 2],
             ["chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr3"],
             [100, 100, 100, 200, 1, 3, 1],
             [True, True, True, False, True, False, True],
             [-1, 0, 1, 3, 0, 2, 1])),
    columns=["counts", "chr", "pos", "strand", "distance"])

df.iloc[:,1:].dtypes
Out[]: 
chr         object
pos          int64
strand        bool
distance     int64
dtype: object

For this small table it works well with a regular DataFrame:

pd.pivot_table(df, index= [ "chr", "pos"], columns= ["strand","distance"], values= "counts").fillna(0)

     strand   False    True       
distance     2  3    -1  0  1
chr  pos                     
chr1 100     0  0     3  2  4
     200     0  1     0  0  0
chr2 1       0  0     0  5  0
     3       3  0     0  0  0
chr3 1       0  0     0  0  2

But I need to do this on much larger matrices, so I tried the following trick:

dfpiv = pd.pivot_table(pd.SparseDataFrame(df), index= [ "chr", "pos"], columns= ["strand","distance"], values= "counts")

but I get:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Are there any plans to add an option to the pivot function for automatic conversion into a SparseDataFrame?


DSLituiev · 2015-11-18T01:13:52Z

If I include default_fill_value=0, which makes sense in my case, I get yet another error:

>>> dfsp = pd.SparseDataFrame(df, default_fill_value=0)
ValueError: could not convert string to float: '<value from "chr" column>'
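One plausible reading of this message, which the thread does not confirm, is that supplying a numeric fill value makes the constructor try to coerce every column, including the textual "chr" column, to float. The sketch below reproduces the same kind of ValueError with plain NumPy. It says nothing about where the coercion happens inside pandas, and the name `chr_values` is only illustrative.

```python
import numpy as np

# Illustrative stand-in for the textual "chr" column (object dtype).
chr_values = np.array(["chr1", "chr2"], dtype=object)

# Coercing strings such as "chr1" to float cannot succeed, so NumPy raises
# ValueError. Assumption: the sparse constructor hits a similar coercion.
try:
    chr_values.astype(float)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

If that reading is right, the fill value itself is not the problem: a float-only sparse container cannot hold the string key.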

jreback · 2015-11-18T12:03:36Z

you would have to show a copy-pastable example, and the output of pd.show_versions()

DSLituiev · 2015-12-10T04:39:59Z

please see updated post with an example above

DSLituiev · 2015-12-10T04:47:08Z

pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.0
nose: 1.3.7
pip: 7.1.2
setuptools: 18.4
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.0
statsmodels: None
IPython: 4.0.0
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2.dev0
numexpr: 2.4.3
matplotlib: 1.4.3
openpyxl: 2.2.6
xlrd: 0.9.3
xlwt: 1.0.0
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.5
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)

jreback · 2015-12-10T12:09:34Z

this is quite easy to fix: need to replace ~np.isnan(arr) with pd.notnull(arr)

pull-requests are welcome
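To illustrate why that replacement helps, here is a minimal sketch on a standalone array rather than the actual pandas sparse code. The variable names are illustrative, and the object array stands in for data such as the "chr" column.

```python
import numpy as np
import pandas as pd

# Object-dtype array with one missing value, similar to a textual key column.
arr = np.array(["chr1", "chr2", None], dtype=object)

# np.isnan is only defined for numeric dtypes, so it raises TypeError here.
try:
    np.isnan(arr)
    isnan_failed = False
except TypeError:
    isnan_failed = True

# pd.notnull works element-wise on object arrays and treats None/NaN as missing.
mask = pd.notnull(arr)

# On float data the two expressions give the same mask.
floats = np.array([1.0, np.nan, 3.0])
same_on_floats = np.array_equal(~np.isnan(floats), pd.notnull(floats))

print(isnan_failed, mask, same_on_floats)
```

Because the two expressions give the same mask on this float data, swapping one for the other should not change behavior for numeric input. It only extends the missing-value check to object dtype.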

DSLituiev · 2015-12-16T21:38:58Z

Do you have a test file dedicated to sparse?

jreback · 2015-12-16T22:29:48Z

https://github.com/pydata/pandas/blob/master/pandas/sparse/tests/test_sparse.py

jreback added the Reshaping and Sparse labels Nov 18, 2015

jreback added the Bug and Difficulty Novice labels Dec 10, 2015

jreback added this to the Next Major Release milestone Dec 10, 2015

DSLituiev mentioned this issue Dec 16, 2015

fixed conversion to sparse for non-numeric index #11856

Closed

jreback modified the milestones: 0.18.1, Next Major Release Feb 23, 2016

jreback modified the milestones: 0.18.2, 0.18.1 Apr 18, 2016

sinhrks mentioned this issue May 17, 2016

BUG: Sparse creation with object dtype may raise TypeError #13201

Closed

jreback closed this as completed in 86f68e6 May 18, 2016

sinhrks mentioned this issue Aug 20, 2016

ENH: Sparse int64 and bool dtype support enhancement #13849

Merged

4 tasks
