What is the correct way to use SparseDataFrame in pandas?

Asked: 2014-08-18 07:40:28

Tags: pandas

a = [{1:1.0,2:2.0}, {1:1.0,3:3.0}]

pandas.DataFrame(a)
Out[43]: 
   1   2   3
0  1   2 NaN
1  1 NaN   3

pandas.SparseDataFrame(a)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-44-50288c1b1994> in <module>()
----> 1 pandas.SparseDataFrame(a)

/Library/Python/2.7/site-packages/pandas/sparse/frame.pyc in __init__(self, data, index, columns, default_kind, default_fill_value)
     94             sdict, columns, index = self._init_dict(data, index, columns)
     95         elif isinstance(data, (np.ndarray, list)):
---> 96             sdict, columns, index = self._init_matrix(data, index, columns)
     97         elif isinstance(data, DataFrame):
     98             sdict, columns, index = self._init_dict(data, data.index,

/Library/Python/2.7/site-packages/pandas/sparse/frame.pyc in _init_matrix(self, data, index, columns, dtype)
    203 
    204         data = dict([(idx, data[:, i]) for i, idx in enumerate(columns)])
--> 205         return self._init_dict(data, index, columns, dtype)
    206 
    207     def __array_wrap__(self, result):

/Library/Python/2.7/site-packages/pandas/sparse/frame.pyc in _init_dict(self, data, index, columns, dtype)
    174                     v = [v.get(i, nan) for i in index]
    175 
--> 176                 v = sp_maker(v)
    177             sdict[k] = v
    178 

/Library/Python/2.7/site-packages/pandas/sparse/frame.pyc in <lambda>(x)
    159                                           kind=self.default_kind,
    160                                           fill_value=self.default_fill_value,
--> 161                                           copy=True)
    162 
    163         sdict = {}

/Library/Python/2.7/site-packages/pandas/sparse/series.pyc in __new__(cls, data, index, sparse_index, kind, fill_value, name, copy)
    127             if sparse_index is None:
    128                 values, sparse_index = make_sparse(data, kind=kind,
--> 129                                                    fill_value=fill_value)
    130             else:
    131                 values = data

/Library/Python/2.7/site-packages/pandas/sparse/array.pyc in make_sparse(arr, kind, fill_value)
    426 
    427     if np.isnan(fill_value):
--> 428         mask = -np.isnan(arr)
    429     else:
    430         mask = arr != fill_value

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule 'safe'

1 answer:

Answer 0: (score: 0)

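pandas >= 1.0

There isn't one. pandas now supports Extension Types, so SparseDataFrame and SparseSeries have been removed from the API.

Use the SparseArray extension array to declare sparse columns instead.

The old way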

pd.SparseDataFrame({"A": [0, 1]})

The new way [✓]

pd.DataFrame({"A": pd.arrays.SparseArray([0, 1])})

   A
0  0
1  1
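
Applied to the list of dicts from the question, the same idea might look like the sketch below. It assumes pandas >= 1.0 and takes one possible route: build an ordinary DataFrame first, then convert every column with astype and a SparseDtype whose fill value is NaN. The variable names dense and sparse are illustrative only and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Data from the question: each dict is one row, missing keys become NaN.
a = [{1: 1.0, 2: 2.0}, {1: 1.0, 3: 3.0}]

# Build a regular (dense) DataFrame first.
dense = pd.DataFrame(a)

# Convert every column to a sparse float column that treats NaN as the
# fill value, so NaN entries are not stored explicitly.
sparse = dense.astype(pd.SparseDtype("float64", np.nan))

print(sparse.dtypes)          # each column reports a Sparse[float64, nan] dtype
print(sparse.sparse.density)  # fraction of values that are actually stored
```

The .sparse accessor is only available when all columns are sparse, and sparse.sparse.to_dense() converts the frame back to ordinary columns if needed.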