Question

I'm trying to filter a pandas DataFrame df, keeping only the rows where a specific column, df['Title'], contains one of several strings from mylist = ['cat', 'mouse']:

  df.head()

                     Title               Duration   ...       
    0             The Cat1 & Mouse2       33 min    ...          
    1             Legend of the cat       10 min    ...        
    2             Foo-Bar                 3 min     ...      
    3             Legend of Mousopia      5 min     ...          
    4             Cat + Mouse             7 min     ...

Looking at similar questions, I tried to filter df by doing the following:

z = df['Title'].str.lower()

df = df[z.contains([x for x in mylist])]

I expected df.head() to look like:

                     Title               Duration   ...       
    0             The Cat1 & Mouse2       33 min    ...          
    1             Legend of the cat       10 min    ...                 
    4             Cat + Mouse             7 min     ...

However, I keep getting the following error:

AttributeError: 'Series' object has no attribute 'contains'

I have updated conda and pandas, but I still get the same result.

conda version : 4.5.4
conda-build version : 3.8.0
python version : 3.6.5.final.0
pandas version : 0.23.0           py36h830ac7b_0

What am I missing?

Answer 1

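Try df['Title'].str.contains('|'.join(mylist), case=False). contains is a method of the .str accessor, not of the Series itself, which is why z.contains(...) raises the AttributeError. It also expects a single pattern string rather than a list, so the list items are joined with | into one regular expression, and case=False makes the match case-insensitive.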
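A minimal sketch of that approach, assuming the sample rows from the question. The DataFrame below is rebuilt by hand from the df.head() output, and the names pattern, mask, and filtered are only illustrative. re.escape is an extra precaution that was not in the original post; it matters only if a search string contains regex metacharacters such as + or (.

```python
import re

import pandas as pd

# Sample data reconstructed from the question's df.head() output.
df = pd.DataFrame({
    "Title": [
        "The Cat1 & Mouse2",
        "Legend of the cat",
        "Foo-Bar",
        "Legend of Mousopia",
        "Cat + Mouse",
    ],
    "Duration": ["33 min", "10 min", "3 min", "5 min", "7 min"],
})
mylist = ["cat", "mouse"]

# Join the search strings into one regex alternation: "cat|mouse".
# re.escape keeps any special characters in the strings literal.
pattern = "|".join(re.escape(s) for s in mylist)

# str.contains returns a boolean Series; case=False ignores case,
# and na=False treats missing titles as non-matches.
mask = df["Title"].str.contains(pattern, case=False, na=False)

filtered = df[mask]
print(filtered)
```

With this sample data the mask keeps rows 0, 1 and 4. Row 3 ("Legend of Mousopia") is dropped because it contains "mous" but not the full string "mouse".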
