I am trying to index a DataFrame with a boolean Series, similar to the approach described here.
In [1]: import pandas as pd
In [2]: idx = pd.Index(["USD.CAD", "AUD.NZD", "EUR.USD", "GBP.USD"],
...: name="Currency Pair")
In [3]: pairs = pd.DataFrame({"mean":[3.6,5.1,3.6,2.7], "count":[1,5,8,2]}, index=idx)
In [4]: mask = pairs.reset_index().loc[:,"Currency Pair"].str.contains("USD")
In [5]: pairs.reset_index()[mask]
Out[5]:
  Currency Pair  count  mean
0       USD.CAD      1   3.6
2       EUR.USD      8   3.6
3       GBP.USD      2   2.7
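The above works as expected, but when I try the same thing on the original DataFrame without resetting the index, I get the following error: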
In [6]: pairs[mask]
C:\Anaconda\lib\site-packages\pandas\core\frame.py:1808: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
"DataFrame index.", UserWarning)
---------------------------------------------------------------------------
IndexingError Traceback (most recent call last)
<ipython-input-6-9eca5ffbdaf7> in <module>()
----> 1 pairs[mask]
C:\Anaconda\lib\site-packages\pandas\core\frame.pyc in __getitem__(self, key)
1772 if isinstance(key, (Series, np.ndarray, Index, list)):
1773 # either boolean or fancy integer index
-> 1774 return self._getitem_array(key)
1775 elif isinstance(key, DataFrame):
1776 return self._getitem_frame(key)
C:\Anaconda\lib\site-packages\pandas\core\frame.pyc in _getitem_array(self, key)
1812 # _check_bool_indexer will throw exception if Series key cannot
1813 # be reindexed to match DataFrame rows
-> 1814 key = _check_bool_indexer(self.index, key)
1815 indexer = key.nonzero()[0]
1816 return self.take(indexer, axis=0, convert=False)
C:\Anaconda\lib\site-packages\pandas\core\indexing.pyc in _check_bool_indexer(ax, key)
1637 mask = com.isnull(result.values)
1638 if mask.any():
-> 1639 raise IndexingError('Unalignable boolean Series key provided')
1640
1641 result = result.astype(bool).values
IndexingError: Unalignable boolean Series key provided
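I am confused by this error, because my impression was that it is raised only when the boolean index has a different length from the DataFrame. That is not the case here, as shown below.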
In [7]: len(mask)
Out[7]: 4
In [8]: len(pairs)
Out[8]: 4
In [9]: len(pairs.reset_index())
Out[9]: 4
Answer 0 (score: 4)
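I thought I would write up the solution that @EdChum gave in the comments. The problem he pointed out is that mask.index does not match pairs.index: the mask is labelled 0 to 3 (it was built after reset_index), while pairs is labelled by currency pair, so the two cannot be aligned even though they have the same length. Replacing the mask's index with the index from pairs gives the expected behavior.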
In[10]: mask.index = pairs.index.copy()
In[11]: pairs[mask]
Out[11]:
               count  mean
Currency Pair
USD.CAD            1   3.6
EUR.USD            8   3.6
GBP.USD            2   2.7
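To make the mismatch concrete, here is a minimal sketch that rebuilds the data from the question and checks which labels the mask and the DataFrame share. The names shared_labels and selected are made up for illustration. It also shows one alternative that is not from the original answer: passing the underlying boolean array (mask.values), which carries no labels and so is applied by position.

```python
import pandas as pd

idx = pd.Index(["USD.CAD", "AUD.NZD", "EUR.USD", "GBP.USD"], name="Currency Pair")
pairs = pd.DataFrame({"mean": [3.6, 5.1, 3.6, 2.7], "count": [1, 5, 8, 2]}, index=idx)

# Same mask as in the question: built after reset_index(), so it is labelled 0..3.
mask = pairs.reset_index().loc[:, "Currency Pair"].str.contains("USD")

# The two indexes have the same length but no labels in common,
# which is why label alignment has nothing to work with.
shared_labels = set(mask.index) & set(pairs.index)
print(len(mask), len(pairs), shared_labels)

# A raw boolean array has no index, so it is applied positionally.
selected = pairs[mask.values]
print(selected)
```

Reassigning mask.index, as in the answer above, keeps the mask as a labelled Series; using the raw array is simply shorter when the mask is only needed once and the row order is known to match.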
Answer 1 (score: 2)
You can use a mask generated directly from the index.
In [22]: mask = pairs.index.str.contains("USD")
In [23]: pairs[mask]
Out[23]:
               count  mean
Currency Pair
USD.CAD            1   3.6
EUR.USD            8   3.6
GBP.USD            2   2.7
There is no need to reindex anything.
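As a self-contained version of the same idea, the sketch below builds the mask from pairs.index and reuses it with .loc to pick out one column. The regex=False argument and the names usd_pairs and usd_means are illustrative additions rather than part of the original answer. regex=False just makes "USD" a literal substring match, which matters only if the pattern contains regex metacharacters such as ".".

```python
import pandas as pd

idx = pd.Index(["USD.CAD", "AUD.NZD", "EUR.USD", "GBP.USD"], name="Currency Pair")
pairs = pd.DataFrame({"mean": [3.6, 5.1, 3.6, 2.7], "count": [1, 5, 8, 2]}, index=idx)

# The mask comes from the DataFrame's own index, so it is already in row order
# and carries no separate labels that would need aligning.
mask = pairs.index.str.contains("USD", regex=False)

usd_pairs = pairs[mask]              # all columns for the matching rows
usd_means = pairs.loc[mask, "mean"]  # same mask, restricted to one column
print(usd_pairs)
print(usd_means)
```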