pandas.get_dummies() raises ValueError: Wrong number of items passed 4, indices imply 3

Time: 2016-03-17 02:56:57

Tags: python pandas

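I have a pandas-based IPython notebook that calls pandas' get_dummies(). This function converts categorical variables into dummy/indicator variables. It works on one machine but not on another. Both machines run Linux Mint and Python 2.7. See the minimal example below.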

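I have seen this error (ValueError: Wrong number of items passed 4, indices imply 3) in some other posts, but the workarounds suggested there did not help, and the same code works on the other machine. Any idea what to do? For example, how can I compare the two IPython/Jupyter installations and their packages?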

import pandas as pd
df = pd.DataFrame({ 'A' : pd.Series(1,index=list(range(4)),dtype='float32'),
                     'B' : 2.,
                     'C' : pd.Categorical(["test","train","test","train"])})
print df
pd.get_dummies(df)

Output:

   A  B      C
0  1  2   test
1  1  2  train
2  1  2   test
3  1  2  train

[4 rows x 3 columns]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-cf3a14671e3b> in <module>()
      5                      'C' : pd.Categorical(["test","train","test","train"])})
      6 print df
----> 7 pd.get_dummies(df)

/usr/lib/python2.7/dist-packages/pandas/core/reshape.pyc in get_dummies(data, prefix, prefix_sep, dummy_na)
    946     """
    947     # Series avoids inconsistent NaN handling
--> 948     cat = Categorical.from_array(Series(data))
    949     levels = cat.levels
    950 

/usr/lib/python2.7/dist-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
    220                                        raise_cast_failure=True)
    221 
--> 222                 data = SingleBlockManager(data, index, fastpath=True)
    223 
    224         generic.NDFrame.__init__(self, data, fastpath=True)

/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
   3591                 block = block[0]
   3592             if not isinstance(block, Block):
-> 3593                 block = make_block(block, axis, axis, ndim=1, fastpath=True)
   3594 
   3595         else:

/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in make_block(values, items, ref_items, klass, ndim, dtype, fastpath, placement)
   1991 
   1992     return klass(values, items, ref_items, ndim=ndim, fastpath=fastpath,
-> 1993                  placement=placement)
   1994 
   1995 

/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, values, items, ref_items, ndim, fastpath, placement)
   1356         super(ObjectBlock, self).__init__(values, items, ref_items, ndim=ndim,
   1357                                           fastpath=fastpath,
-> 1358                                           placement=placement)
   1359 
   1360     @property

/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, values, items, ref_items, ndim, fastpath, placement)
     62         if len(items) != len(values):
     63             raise ValueError('Wrong number of items passed %d, indices imply '
---> 64                              '%d' % (len(items), len(values)))
     65 
     66         self.set_ref_locs(placement)

ValueError: Wrong number of items passed 4, indices imply 3

1 answer:

Answer 0 (score: 0)

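In my experience, the problem was the pandas version: the code failed on the machine running an old pandas (0.13.X) and ran fine on the other machine, which had a recent pandas package (0.19.1). (Thank you, dartdog, for the suggestion to compare package versions with pip list.)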
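To answer the "how do I compare the two installations" part more concretely, here is a minimal sketch that diffs the output of pip freeze from two machines. The helper names (parse_freeze, diff_versions) and the sample freeze text are made up for illustration; only the 0.13 versus 0.19.1 pandas gap comes from my case. It assumes Python 3 and uses only the standard library.

```python
def parse_freeze(text):
    """Parse `pip freeze` output into a {package_name: version} dict."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries not pinned as name==version
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        versions[name.lower()] = version
    return versions


def diff_versions(freeze_a, freeze_b):
    """Return {name: (version_a, version_b)} for packages that differ.

    A package missing on one side is reported with None for that side.
    """
    a, b = parse_freeze(freeze_a), parse_freeze(freeze_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Made-up freeze output standing in for the two machines
old_machine = "numpy==1.8.2\npandas==0.13.1\n"
new_machine = "numpy==1.8.2\npandas==0.19.1\n"
print(diff_versions(old_machine, new_machine))
# {'pandas': ('0.13.1', '0.19.1')}
```

In practice you would save the output of pip freeze to a file on each machine and pass the two file contents to diff_versions.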

If your code is packaged with setup.py, you can enforce a minimum package version:

install_requires = ['pandas>=0.19.1']

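Apparently Buildout honors setuptools, so when you install your package along with the dependencies specified in its setup.py, it checks for the correct versions and updates them as needed.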
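install_requires is only evaluated when a package is installed, so it does not protect a standalone notebook. For that case, a runtime guard at the top of the notebook is one alternative. The sketch below is my own illustration, not a pandas or setuptools API: version_tuple and require_version are hypothetical helpers, and the comparison is a simple numeric one that ignores pre-release suffixes.

```python
import re


def version_tuple(version):
    """Turn '0.19.1' into (0, 19, 1).

    Keeps the leading digits of each dotted component and stops at the
    first component that has none (e.g. 'dev').
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def require_version(name, installed, minimum):
    """Raise ImportError if `installed` is older than `minimum`."""
    if version_tuple(installed) < version_tuple(minimum):
        raise ImportError(
            "%s %s is too old; version %s or newer is required"
            % (name, installed, minimum)
        )


# In a notebook: require_version("pandas", pd.__version__, "0.19.1")
require_version("pandas", "0.19.1", "0.19.1")  # passes silently
```

With the old installation, the same call with "0.13.1" as the installed version raises ImportError immediately, which is easier to diagnose than the traceback in the question.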

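If you are on a machine where you do not have permission to update Python libraries directly, use the --user flag with pip to install into your local user library: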

pip install --user foo

The --upgrade flag forces the package to be upgraded, so if all else fails you can use pip to upgrade the package to the correct version.
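Putting the two flags together, here is a small sketch that builds the upgrade command for the interpreter that is actually running your code, which matters when the notebook kernel and the pip on your PATH belong to different Python installations. pip_upgrade_command is a hypothetical helper name, and the 'pandas>=0.19.1' requirement is taken from the setup.py line above. The function only builds the argument list; nothing is installed until you pass it to subprocess.

```python
import sys


def pip_upgrade_command(requirement, user=True):
    """Build the argument list for upgrading `requirement` with pip."""
    # `sys.executable -m pip` targets the interpreter running this code
    command = [sys.executable, "-m", "pip", "install", "--upgrade"]
    if user:
        # Install into the per-user site-packages instead of the system one
        command.append("--user")
    command.append(requirement)
    return command


print(" ".join(pip_upgrade_command("pandas>=0.19.1")))
# To actually run it:
#   import subprocess
#   subprocess.check_call(pip_upgrade_command("pandas>=0.19.1"))
```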