Variable visibility problem with Pandas and IPython

Time: 2014-12-24 13:40:17

Tags: python pandas namespaces ipython

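When I try to filter a DataFrame by selected index values (in an IPython session), I get a NameError exception. Below you can see valid (a numpy.array) and lab (a pandas.DataFrame). Both are initialized and accessible, but I cannot use them together. Here is the error: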

In [51]: valid
Out[51]: 
array([38661, 44593, 38705, 38918, 38727, 38757, 38751, 38777, 38787,
       ...,    
       45328, 45337, 43645, 43694, 43701])

In [52]: lab
Out[52]: 
         0
39333   -1
39173   -1
42756   -1
39633   -1
38661   -1
44801   81
...    ...
39379   -1
39742   -1
44765  108
44279   -1
40584   -1
41047   -1
41833   98

[3299 rows x 1 columns]

In [53]: lab[lab.index.map(lambda x: x in valid)]
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/django/core/management/commands/shell.pyc in <module>()
----> 1 lab[lab.index.map(lambda x: x in valid)]

/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/pandas/core/index.pyc in map(self, mapper)
   1558 
   1559     def map(self, mapper):
-> 1560         return self._arrmap(self.values, mapper)
   1561 
   1562     def isin(self, values, level=None):

/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/pandas/algos.so in pandas.algos.arrmap_int64 (pandas/algos.c:78469)()

/home/vitaly/progs/vnii_gochs/venv/lib/python2.7/site-packages/django/core/management/commands/shell.pyc in <lambda>(x)
----> 1 lab[lab.index.map(lambda x: x in valid)]

NameError: global name 'valid' is not defined

What is wrong with this code?

UPD: lab.pkl (pickle format), valid.npy (numpy binary format)

1 answer:

Answer 0 (score: 1)

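It is not clear whether you are trying to add a new column to lab or to get the values in the order specified by the valid array. To add a new column to lab, you can do lab['new'] = valid (this requires valid to have exactly as many elements as lab has rows). To get the rows for the labels in the valid array, you can do lab.loc[valid]. If you only want the raw numpy array, do lab.loc[valid].values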
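A minimal sketch of these options, plus a filter that avoids the lambda altogether. The small lab and valid below are made-up stand-ins for the real lab.pkl and valid.npy, and the column name in_valid is hypothetical. Index.isin is the method visible next to map in the pandas source excerpt of the traceback. Since the NameError is raised inside the lambda when it looks up valid as a global, passing valid directly as an argument sidesteps that lookup. This does not explain why the name is missing in this particular shell.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's objects
# (the real data lives in lab.pkl and valid.npy).
lab = pd.DataFrame({0: [-1, -1, 81, -1, 108]},
                   index=[39333, 38661, 44801, 38705, 44765])
valid = np.array([38661, 38705, 44765])

# Boolean mask computed by pandas itself: `valid` is evaluated here,
# at the top level, not inside a lambda called later.
filtered = lab[lab.index.isin(valid)]

# Label-based selection; rows come back in the order given by `valid`.
selected = lab.loc[valid]

# Underlying numpy array of the selected rows (2-D, one column).
raw = selected.values

# A new column must match the number of rows, so this stores the mask
# rather than `valid` itself (which is shorter than `lab` here).
lab['in_valid'] = lab.index.isin(valid)

print(filtered)
print(raw.ravel())
```

Note that lab.loc[valid] expects every label in valid to be present in the index, while the isin mask simply ignores labels that are missing.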