Question

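I am trying to filter a DataFrame with isin() by passing in a list and comparing it against a DataFrame column whose entries are themselves lists. This is an extension of the following question: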

How to implement 'in' and 'not in' for Pandas dataframe

For example, instead of holding a single country, each row now holds a list of countries.

df = pd.DataFrame({'countries':[['US', 'UK'], ['UK'], ['Germany', 'France'], ['China']]})

To filter, I set up two separate lists:

countries = ['UK','US']
countries_2 = ['UK']

The expected result should be the same for both, because rows 0 and 1 each contain UK and/or US:

>>> df[df.countries.isin(countries)]
  countries
0     US, UK
1         UK
>>> df[df.countries.isin(countries_2)]
  countries
0     US, UK
1         UK

But Python throws the following error:

TypeError: unhashable type: 'list'

Answer 1

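One possible solution uses sets with issubset or isdisjoint, applied row by row with map. issubset keeps the rows whose list contains every country you pass in; the negated isdisjoint keeps the rows that share at least one country with it: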

print (df[df.countries.map(set(countries).issubset)])
  countries
0  [US, UK]

print (df[~df.countries.map(set(countries).isdisjoint)])
  countries
0  [US, UK]
1      [UK]

print (df[df.countries.map(set(countries_2).issubset)])
  countries
0  [US, UK]
1      [UK]

print (df[~df.countries.map(set(countries_2).isdisjoint)])
  countries
0  [US, UK]
1      [UK]
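If you need this in more than one place, the two masks above can be wrapped in small helpers. This is only a sketch: the names contains_any, contains_all and contains_any_exploded are made up for illustration and are not pandas API. The last helper is a different technique from the one in this answer: it flattens the lists with Series.explode so that the regular isin can be used, then collapses the result back to one value per original row.

```python
import pandas as pd

df = pd.DataFrame({'countries': [['US', 'UK'], ['UK'], ['Germany', 'France'], ['China']]})

def contains_any(series, wanted):
    """Hypothetical helper: True where the row's list shares at least one item with `wanted`."""
    wanted = set(wanted)
    # isdisjoint is True when there is no overlap, so negate it
    return ~series.map(wanted.isdisjoint).astype(bool)

def contains_all(series, wanted):
    """Hypothetical helper: True where the row's list contains every item of `wanted`."""
    wanted = set(wanted)
    return series.map(wanted.issubset).astype(bool)

def contains_any_exploded(series, wanted):
    """Hypothetical helper, alternative approach: explode the lists, then use isin."""
    exploded = series.explode()  # one row per list item, original index repeated
    return exploded.isin(list(wanted)).groupby(level=0).any()

countries = ['UK', 'US']
countries_2 = ['UK']

any_match = df[contains_any(df.countries, countries)]
all_match = df[contains_all(df.countries, countries)]
any_match_2 = df[contains_any(df.countries, countries_2)]
any_match_exploded = df[contains_any_exploded(df.countries, countries)]
```

The set-based helpers never hash the list cells themselves, only their items, which is why they avoid the TypeError from the question. The explode variant assumes the index values are unique, since it groups on the index to get back to one row per original entry.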

Pandas .isin on column entries that contain lists
