Pandas set_index Multiindex Lookup

Time: 2014-10-02 16:50:14

Tags: python pandas

I can't find a way to look up values by MultiIndex in pandas 0.14. Here is some mock data that shows the problem I'm running into.

Code:

from pandas import DataFrame

row1 = ['red', 'ferrari', 'mine']
row2 = ['blue', 'ferrari', 'his']
row3 = ['red', 'lambo', 'his']
row4 = ['yellow', 'porsche', 'his']
row5 = ['yellow', 'lambo', 'his']
all_dat = [row1, row2, row3, row4, row5]
df = DataFrame(all_dat, columns=['Color', 'Make', 'Ownership'])

print df
df = df.set_index(['Color', 'Make'])
print df

print df['red']['lambo']
print df['yellow']['porsche']

Output:

    Color     Make Ownership
0     red  ferrari      mine
1    blue  ferrari       his
2     red    lambo       his
3  yellow  porsche       his
4  yellow    lambo       his
               Ownership
Color  Make             
red    ferrari      mine
blue   ferrari       his
red    lambo         his
yellow porsche       his
       lambo         his

Traceback (most recent call last):
    print df['red']['lambo']
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1678, in __getitem__
    return self._getitem_column(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1685, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 1052, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 2565, in get
    loc = self.items.get_loc(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 1181, in get_loc
    return self._engine.get_loc(_values_from_object(key))
  File "index.pyx", line 129, in pandas.index.IndexEngine.get_loc (pandas/index.c:3354)
  File "index.pyx", line 149, in pandas.index.IndexEngine.get_loc (pandas/index.c:3234)
  File "hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:11148)
  File "hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:11101)
KeyError: 'red'

I also tried doing the lookup with

df[('red', 'lambo')]

and

df['red', 'lambo']

Both give similar results (KeyErrors).

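So, am I missing some step when setting up the MultiIndex? I want to use set_index() because my real data (this is only mock data) goes through a lot of operations before the index is redefined.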

1 Answer:

Answer 0 (score: 2)

With df.loc, you can specify the desired labels as a list of tuples:

In [99]: df.loc[[('red','lambo')]]
Out[99]: 
            Ownership
Color Make           
red   lambo       his

In [106]: df.loc[[('yellow','porsche'), ('red','lambo')]]
Out[106]: 
               Ownership
Color  Make             
yellow porsche       his
red    lambo         his
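
The lookups in the question fail because df[...] selects columns, and after set_index() the only column left is 'Ownership'; row labels go through df.loc instead. The following is a minimal sketch of the different key forms, rebuilding the question's mock data. It assumes a recent pandas and Python 3 syntax rather than the pandas 0.14 / Python 2.7 setup in the question, and the variable names (column_labels, row, owner, subset, reds) are only illustrative.

```python
import pandas as pd

rows = [
    ['red', 'ferrari', 'mine'],
    ['blue', 'ferrari', 'his'],
    ['red', 'lambo', 'his'],
    ['yellow', 'porsche', 'his'],
    ['yellow', 'lambo', 'his'],
]
df = pd.DataFrame(rows, columns=['Color', 'Make', 'Ownership'])
df = df.set_index(['Color', 'Make'])

# df[...] looks up column labels; 'red' is now an index label, not a column.
column_labels = list(df.columns)

# A single tuple selects one row and returns it as a Series.
row = df.loc[('red', 'lambo')]

# A tuple plus a column label returns the scalar value.
owner = df.loc[('red', 'lambo'), 'Ownership']

# A list of tuples (as in the answer) keeps the result as a DataFrame.
subset = df.loc[[('red', 'lambo')]]

# A first-level label alone selects every row under that Color.
reds = df.loc['red']

print(column_labels)
print(owner)
print(subset)
print(reds)
```

The list-of-tuples form is handy when the result should stay a DataFrame with its MultiIndex intact; the plain tuple form is shorter when only one row or one value is needed.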

Assignment can be done like this:

In [117]: df.loc[[('red', 'lambo')], 'Ownership'] = 'mine'

In [118]: df
Out[118]: 
               Ownership
Color  Make             
red    ferrari      mine
blue   ferrari       his
red    lambo        mine
yellow porsche       his
       lambo         his
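
For a single cell, the same assignment can also be written with a plain tuple as the row key. This is a small sketch on a cut-down copy of the mock data, again assuming a recent pandas with Python 3 syntax; new_owner and untouched are illustrative names. Passing the row key and the column label in one .loc call writes to df itself, whereas a chained lookup may end up modifying a temporary copy.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['red', 'ferrari', 'mine'],
        ['red', 'lambo', 'his'],
        ['yellow', 'lambo', 'his'],
    ],
    columns=['Color', 'Make', 'Ownership'],
).set_index(['Color', 'Make'])

# Row key (a tuple covering both index levels) and column label in one .loc call.
df.loc[('red', 'lambo'), 'Ownership'] = 'mine'

# Read the values back to confirm only the targeted row changed.
new_owner = df.loc[('red', 'lambo'), 'Ownership']
untouched = df.loc[('yellow', 'lambo'), 'Ownership']

print(df)
```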

See also: Advanced indexing with hierarchical index