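When I try to filter a DataFrame by a selected set of index values (in an IPython session), I get a NameError exception. As you can see below, valid is a numpy.array and lab is a pandas.DataFrame object. Both are initialized and accessible, but I cannot use them together. Here is the error: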
In [51]: valid
Out[51]:
array([38661, 44593, 38705, 38918, 38727, 38757, 38751, 38777, 38787,
...,
45328, 45337, 43645, 43694, 43701])
In [52]: lab
Out[52]:
0
39333 -1
39173 -1
42756 -1
39633 -1
38661 -1
44801 81
... ...
39379 -1
39742 -1
44765 108
44279 -1
40584 -1
41047 -1
41833 98
[3299 rows x 1 columns]
In [53]: lab[lab.index.map(lambda x: x in valid)]
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/django/core/management/commands/shell.pyc in <module>()
----> 1 lab[lab.index.map(lambda x: x in valid)]
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/pandas/core/index.pyc in map(self, mapper)
1558
1559 def map(self, mapper):
-> 1560 return self._arrmap(self.values, mapper)
1561
1562 def isin(self, values, level=None):
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/pandas/algos.so in pandas.algos.arrmap_int64 (pandas/algos.c:78469)()
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/django/core/management/commands/shell.pyc in <lambda>(x)
----> 1 lab[lab.index.map(lambda x: x in valid)]
NameError: global name 'valid' is not defined
What is wrong with this code?
Answer 0 (score: 1)
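It is not clear whether you are trying to add a new column to lab or trying to get the values in the order specified by the valid array. To add a new column to lab, you can do lab['new'] = valid; this only works if valid has the same length as lab. To get the rows of lab labelled by the values in the valid array, you can do lab.loc[valid]; in current pandas versions every label in valid must exist in the index of lab, otherwise a KeyError is raised. If you just want the raw numpy array, do lab.loc[valid].values.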
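If the goal is the one the question describes (keep only the rows whose index label appears in valid), the isin method that shows up in the traceback right below map can build the boolean mask without a lambda. The NameError probably comes from how this Django shell embeds IPython, where names defined at the prompt may not be visible as globals inside the lambda; that is an assumption, not something the traceback proves. A minimal sketch, using small made-up stand-ins for lab and valid:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the `lab` and `valid` objects in the question;
# the real data has 3299 rows and a much longer `valid` array.
lab = pd.DataFrame(
    {0: [-1, -1, -1, 81, 108]},
    index=[39333, 39173, 38661, 44801, 44765],
)
valid = np.array([38661, 44765, 45328])  # 45328 is not in lab's index

# Boolean mask: True where the index label occurs in `valid`.
# No lambda is involved, so no name has to be looked up inside a nested function.
mask = lab.index.isin(valid)

# Rows keep their original order in `lab`; labels missing from `lab` are ignored.
filtered = lab[mask]
print(filtered)
```

Unlike lab.loc[valid], this keeps the row order of lab and does not fail on labels in valid that are absent from the index.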