Question

I am trying to take a subset of a DataFrame: select only some of the columns and, at the same time, filter the rows.

df.loc[df.a.isin(['Apple', 'Pear', 'Mango']), ['a', 'b', 'f', 'g']]

However, I get this warning:

Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

What is the correct way to slice and filter now?

Answer 1

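This change was introduced in v0.21.0 and is explained at length in the docs -

Previously, selecting with a list of labels, where one or more labels were missing, would always succeed, returning NaN for missing labels. This will now show a FutureWarning. In the future this will raise a KeyError (GH15747). This warning will trigger on a DataFrame or a Series for using .loc[] or [[]] when passing a list-of-labels with at least 1 missing label.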

For example,

df

     A    B  C
0  7.0  NaN  8
1  3.0  3.0  5
2  8.0  1.0  7
3  NaN  0.0  3
4  8.0  2.0  7

Try some slicing -

df.loc[df.A.gt(6), ['A', 'C']]

     A  C
0  7.0  8
2  8.0  7
4  8.0  7

No problem. Now, try replacing C with a column label that does not exist -

df.loc[df.A.gt(6), ['A', 'D']]
FutureWarning: Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

     A   D
0  7.0 NaN
2  8.0 NaN
4  8.0 NaN

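So, in your case, the warning is caused by the column labels you pass to loc: at least one of them is not a column of df. Take another look at them.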
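As a rough sketch of how one might track down the offending label and then choose between the two behaviours, the snippet below rebuilds the example frame from this answer. The names wanted, missing, with_nan and existing_only are only illustrative. It assumes you either want the old "NaN for missing labels" result (via .reindex, as the warning suggests) or only the columns that really exist.

```python
import numpy as np
import pandas as pd

# Same data as the example frame above.
df = pd.DataFrame({
    "A": [7.0, 3.0, 8.0, np.nan, 8.0],
    "B": [np.nan, 3.0, 1.0, 0.0, 2.0],
    "C": [8, 5, 7, 3, 7],
})

wanted = ["A", "D"]  # "D" is not a column of df

# Which requested labels are absent?
missing = [c for c in wanted if c not in df.columns]
print("missing labels:", missing)

# Option 1: filter rows first, then reindex the columns.
# Missing labels come back as all-NaN columns, without a warning.
with_nan = df.loc[df.A.gt(6)].reindex(columns=wanted)

# Option 2: request only the labels that actually exist.
existing_only = df.loc[df.A.gt(6), df.columns.intersection(wanted)]

print(with_nan)
print(existing_only)
```

Option 1 reproduces the pre-deprecation result; option 2 silently drops the unknown label, which may or may not be what you want.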

Answer 2

If you want to keep the index, you can pass a list comprehension instead of a plain list of columns.

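The pandas code for this answer is not shown here, so the following is only a guess at what the list-comprehension approach could look like. It assumes the intent is to keep just those requested labels that are present in df.columns. The frame, its index values and the name wanted are made up for illustration.

```python
import pandas as pd

# Hypothetical frame: columns 'f' and 'g' are deliberately absent.
df = pd.DataFrame(
    {"a": ["Apple", "Kiwi", "Pear", "Mango"], "b": [1, 2, 3, 4]},
    index=[10, 11, 12, 13],
)

wanted = ["a", "b", "f", "g"]
mask = df["a"].isin(["Apple", "Pear", "Mango"])

# The comprehension keeps only labels that exist, so .loc never sees a
# missing label; the original row index is preserved.
result = df.loc[mask, [c for c in wanted if c in df.columns]]
print(result)
```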

Answer 3

This warning also shows up on .append calls when the list being appended contains new columns. To avoid it,

use:

df=df.append(pd.Series({'A':i,'M':j}), ignore_index=True)

instead of

df=df.append([{'A':i,'M':j}], ignore_index=True)

Full warning message:

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py:1472: FutureWarning: Passing list-likes to .loc or [] with any missing label will raise KeyError in the future, you can use .reindex() as an alternative.

Thanks to https://stackoverflow.com/a/50230080/207661
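DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on a recent install neither form above will run. A minimal sketch of the same "add one row that brings a new column" step using pd.concat follows. The starting frame and the values of i and j are placeholders, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2]})
i, j = 3, 4  # placeholder values for the new row

# Wrap the new row in a one-row DataFrame; 'M' is a column df lacks.
new_row = pd.DataFrame([{"A": i, "M": j}])

# concat takes the union of the columns; the older rows get NaN for 'M'.
df = pd.concat([df, new_row], ignore_index=True)
print(df)
```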

Answer 4

Sorry, I'm not sure I understood you correctly, but it seems the following would work for you:

df[df['a'].isin(['Apple', 'Pear', 'Mango'])][['a', 'b', 'f', 'g']]

Explanation of the snippet:

df['a'].isin(['Apple', 'Pear', 'Mango']) # row filter: boolean mask, True where column a holds one of these values

df[['a', 'b', 'f', 'g']] # column filter: selects the listed set of columns
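To make the two steps concrete, here is a small sketch on an invented frame in which all four requested columns exist. It compares the chained form from this answer with the single .loc call from the question. The data and the extra column h are assumptions for illustration only.

```python
import pandas as pd

# Invented data; every requested column is present here.
df = pd.DataFrame({
    "a": ["Apple", "Kiwi", "Pear", "Mango"],
    "b": [1, 2, 3, 4],
    "f": [0.1, 0.2, 0.3, 0.4],
    "g": list("wxyz"),
    "h": [True, False, True, False],
})

mask = df["a"].isin(["Apple", "Pear", "Mango"])  # row filter
cols = ["a", "b", "f", "g"]                      # column filter

chained = df[mask][cols]     # two steps: rows, then columns
single = df.loc[mask, cols]  # one step: rows and columns together

print(chained.equals(single))
```

Both forms select the same data when every label exists. The single .loc call avoids the intermediate copy and is the form to prefer if you later assign into the selection.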

Pandas slicing FutureWarning with 0.21.0
