Question

Here is the setup for a toy example:

import collections as co
import pandas as pd

data = [['a',  1],
        ['b',  2],
        ['a',  3],
        ['b',  1],
        ['c',  2],
        ['c',  3],
        ['b',  1]]

colnames = tuple('XY')

df = pd.DataFrame(co.OrderedDict([(colnames[i],
                                   [row[i] for row in data])
                                  for i in range(len(colnames))]))

OK, to get a boolean "indicator" Series object (suitable for indexing) that shows whether the values in the X column are equal to 'a', I can do this:

In [230]: df['X'] == 'a'
Out[230]:
0     True
1    False
2     True
3    False
4    False
5    False
6    False
Name: X, dtype: bool

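Nice, but what I really want to do is test whether each value is one of several possible values. I tried using set membership for this, but it bombs: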

In [231]: df['X'] in set(['a', 'b'])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-266-0819ab764ce2> in <module>()
----> 1 df['X'] in set(['a', 'b'])

/Users/yt/.virtualenvs/pd/lib/python2.7/site-packages/pandas/core/generic.pyc in __hash__(self)
    639     def __hash__(self):
    640         raise TypeError('{0!r} objects are mutable, thus they cannot be'
--> 641                         ' hashed'.format(self.__class__.__name__))
    642
    643     def __iter__(self):

TypeError: 'Series' objects are mutable, thus they cannot be hashed

How can I do this?

Note: in the case I'm working with, the set of allowed values is large and known only at run time, so an or expression is out of the question.

How do I generate a boolean "indicator Series" based on inclusion in a set?
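
For reference, one possible approach is sketched below. It assumes pandas' `Series.isin` method, which tests each element for membership in a collection and returns a boolean Series. The variable names `allowed`, `mask`, and `subset` are hypothetical, and the DataFrame is rebuilt from a plain dict rather than with the OrderedDict construction above, purely to keep the sketch short.

```python
import pandas as pd

# Same toy data as in the question, built directly from columns.
df = pd.DataFrame({'X': ['a', 'b', 'a', 'b', 'c', 'c', 'b'],
                   'Y': [1, 2, 3, 1, 2, 3, 1]})

# Hypothetical set of allowed values; in the real case it would be
# large and computed at run time.
allowed = {'a', 'b'}

# Element-wise membership test: returns a boolean Series aligned with df.
mask = df['X'].isin(allowed)

# The boolean Series can be used directly for indexing.
subset = df[mask]

print(mask)
print(subset)
```

Unlike `df['X'] in allowed`, which tries to hash the whole Series, this checks each element separately, so the size of the allowed set does not change how the expression is written.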

0 answers: