value_counts does not work on a pandas sparse DataFrame

Asked: 2014-03-27 12:26:22

Tags: python pandas typeerror sparse-matrix

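Calling value_counts on a column of a pandas sparse DataFrame raises a TypeError. The versions of the packages I am using, and a session that reproduces the error, are listed below.

Any suggestions on how to make this work?

Thanks in advance. Let me know if more information is needed.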

Python 2.7.6 |Anaconda 1.9.1 (x86_64)| (default, Jan 10 2014, 11:23:15) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> import pandas
>>> print pandas.__version__
0.13.1
>>> import numpy
>>> print numpy.__version__
1.8.0

>>> dense_df = pandas.DataFrame(numpy.zeros((10, 10))
                               ,columns=['x%d' % ix for ix in range(10)])
>>> dense_df['x5'] = [1.0, 0.0, 0.0, 1.0, 2.1, 3.0, 0.0, 0.0, 0.0, 0.0]
>>> print dense_df['x5'].value_counts()
0.0    6
1.0    2
3.0    1
2.1    1
dtype: int64

>>> sparse_df = dense_df.to_sparse(fill_value=0) # Tried fill_value=0.0 also
>>> print sparse_df.density
0.04

>>> print sparse_df['x5'].value_counts()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "//anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 1156, in value_counts
    normalize=normalize, bins=bins)
  File "//anaconda/lib/python2.7/site-packages/pandas/core/algorithms.py", line 231, in value_counts
    values = com._ensure_object(values)
  File "generated.pyx", line 112, in pandas.algos.ensure_object (pandas/algos.c:38788)
  File "generated.pyx", line 117, in pandas.algos.ensure_object (pandas/algos.c:38695)
  File "//anaconda/lib/python2.7/site-packages/pandas/sparse/array.py", line 377, in astype
    raise TypeError('Can only support floating point data for now')
TypeError: Can only support floating point data for now

1 answer:

Answer 0 (score: 2)

This is not implemented at the moment. Convert the column to dense first:

In [12]: sparse_df['x5'].to_dense().value_counts()
Out[12]: 
0.0    6
1.0    2
3.0    1
2.1    1
dtype: int64
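
For readers on a more recent pandas, where `to_sparse` is no longer available, the same "densify, then count" idea can be sketched roughly as follows. This is only an illustration and not part of the original answer: it assumes a pandas version that provides `pd.SparseDtype` and the `.sparse` accessor, and the helper name `dense_value_counts` is hypothetical.

```python
import numpy as np
import pandas as pd


def dense_value_counts(series):
    """Densify a sparse Series if needed, then count its values.

    Hypothetical helper for illustration; assumes a pandas version
    that provides pd.SparseDtype and the Series.sparse accessor.
    """
    if isinstance(series.dtype, pd.SparseDtype):
        # Same idea as the answer: convert to dense before counting.
        series = series.sparse.to_dense()
    return series.value_counts()


# Rebuild the frame from the question.
dense_df = pd.DataFrame(np.zeros((10, 10)),
                        columns=["x%d" % ix for ix in range(10)])
dense_df["x5"] = [1.0, 0.0, 0.0, 1.0, 2.1, 3.0, 0.0, 0.0, 0.0, 0.0]

# Assumed modern counterpart of dense_df.to_sparse(fill_value=0).
sparse_df = dense_df.astype(pd.SparseDtype("float64", fill_value=0.0))

counts = dense_value_counts(sparse_df["x5"])
print(counts)
```

The helper also accepts an ordinary dense Series, in which case it simply calls `value_counts` on it. Note that densifying a very large sparse column costs memory proportional to its full length.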