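I have a Python 2 based IPython notebook that calls pandas' get_dummies(). This function converts categorical variables into dummy/indicator variables. It works on one machine but not on another. Both machines run Linux Mint and Python 2.7. See the minimal example below.
I have seen this error (ValueError: Wrong number of items passed 4, indices imply 3) in a few other posts, but the fixes suggested there did not help, and the same code works on the other machine. Any idea what to do? For example, how can I compare the IPython/Jupyter installations and packages on the two machines?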
import pandas as pd
df = pd.DataFrame({ 'A' : pd.Series(1,index=list(range(4)),dtype='float32'),
                    'B' : 2.,
                    'C' : pd.Categorical(["test","train","test","train"])})
print df
pd.get_dummies(df)
Output:
A B C
0 1 2 test
1 1 2 train
2 1 2 test
3 1 2 train
[4 rows x 3 columns]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-2-cf3a14671e3b> in <module>()
5 'C' : pd.Categorical(["test","train","test","train"])})
6 print df
----> 7 pd.get_dummies(df)
/usr/lib/python2.7/dist-packages/pandas/core/reshape.pyc in get_dummies(data, prefix, prefix_sep, dummy_na)
946 """
947 # Series avoids inconsistent NaN handling
--> 948 cat = Categorical.from_array(Series(data))
949 levels = cat.levels
950
/usr/lib/python2.7/dist-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
220 raise_cast_failure=True)
221
--> 222 data = SingleBlockManager(data, index, fastpath=True)
223
224 generic.NDFrame.__init__(self, data, fastpath=True)
/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
3591 block = block[0]
3592 if not isinstance(block, Block):
-> 3593 block = make_block(block, axis, axis, ndim=1, fastpath=True)
3594
3595 else:
/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in make_block(values, items, ref_items, klass, ndim, dtype, fastpath, placement)
1991
1992 return klass(values, items, ref_items, ndim=ndim, fastpath=fastpath,
-> 1993 placement=placement)
1994
1995
/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, values, items, ref_items, ndim, fastpath, placement)
1356 super(ObjectBlock, self).__init__(values, items, ref_items, ndim=ndim,
1357 fastpath=fastpath,
-> 1358 placement=placement)
1359
1360 @property
/usr/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, values, items, ref_items, ndim, fastpath, placement)
62 if len(items) != len(values):
63 raise ValueError('Wrong number of items passed %d, indices imply '
---> 64 '%d' % (len(items), len(values)))
65
66 self.set_ref_locs(placement)
ValueError: Wrong number of items passed 4, indices imply 3
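Answer 0 (score: 0)
In my experience, the problem is that the code fails on the machine running an old version of pandas (0.13.X) and runs fine on the other machine, which has a much newer pandas package (0.19.1). (Thank you, dartdog, for the suggestion to compare package versions with pip list.)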
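To confirm a mismatch like this, the same small script can be run on both machines and the output compared. The following is only a rough sketch: the helper names package_version and collect_versions and the package list are illustrative assumptions, not part of the original notebook.

```python
from __future__ import print_function

import importlib
import sys


def package_version(name):
    """Return the __version__ of an importable module, or None if unavailable."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def collect_versions(names):
    """Build a dict mapping 'python' and each package name to a version string."""
    report = {"python": sys.version.split()[0]}
    for name in names:
        report[name] = package_version(name)
    return report


if __name__ == "__main__":
    # The package list is an assumption; extend it with whatever the notebook imports.
    versions = collect_versions(["pandas", "numpy", "IPython"])
    for name, version in sorted(versions.items()):
        print("{0}: {1}".format(name, version))
```

A package that is not installed shows up as None, so a missing dependency is visible in the same report as a version difference.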
If your code is packaged with setup.py, you can enforce the package version:
install_requires = ['pandas>=0.19.1']
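Apparently Buildout honors setuptools, so when you install your package together with the dependencies specified in its setup.py, it checks for the correct versions and updates them as needed.
If you are on a machine where you do not have permission to update Python libraries directly, use the --user flag with pip to update your per-user library: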
pip install --user foo
The --upgrade flag forces the package to be updated, so if all else fails you can try using pip to upgrade the package to the correct version.
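If upgrading is not possible right away, a guard at the top of the notebook can at least replace the obscure ValueError with a clear message. This is a minimal sketch that assumes 0.19.1 (the version that worked for me) is a sufficient minimum. I have not checked which release between 0.13 and 0.19.1 first works, and the helper names are hypothetical.

```python
import re

# Assumption: the version reported as working in this answer is used as the minimum.
MINIMUM_PANDAS = "0.19.1"


def version_tuple(version):
    """Turn '0.19.1' into (0, 19, 1); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits, e.g. 'X'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum=MINIMUM_PANDAS):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def require_pandas(minimum=MINIMUM_PANDAS):
    """Import pandas and raise a descriptive error if it is too old."""
    import pandas
    if not meets_minimum(pandas.__version__, minimum):
        raise RuntimeError(
            "pandas {0} found, but {1} or newer is needed".format(
                pandas.__version__, minimum))
    return pandas
```

Calling pd = require_pandas() in the first cell then fails early on the old machine. The comparison is deliberately simple and treats a pre-release such as 0.20.0rc1 the same as 0.20.0.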