I have a DataFrame:
import pandas as pd

d = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'text': ['bill did this', 'jim did something', 'phil', 'bill did nothing',
              'carl was here', 'this is random', 'other name',
              'other bill', 'bill and carl', 'last one']}
df = pd.DataFrame(data=d)
I want to check whether a column contains any of the values in a list, where the list is:
list = ['bill','carl']
and then return something like this:
id  text               contains
1   bill did this      bill
2   jim did something
3   phil
4   bill did nothing   bill
5   carl was here      carl
6   this is random
7   other name
8   other bill         bill
9   bill and carl      bill
9   bill and carl      carl
10  last one
The way two or more names in the same row are handled can be changed, though. Any suggestions?
Answer 0 (score: 3)
You can use a lambda function that checks each item in the list against the text:
import pandas as pd

d = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'text': ['bill did this', 'jim did something', 'phil', 'bill did nothing',
              'carl was here', 'this is random', 'other name',
              'other bill', 'bill and carl', 'last one']}
df = pd.DataFrame(data=d)
l = ['bill', 'carl']
# For each row, keep every name that occurs as a substring of the text,
# then join the matches into one comma-separated string.
df['contains'] = df['text'].apply(lambda x: ','.join([i for i in l if i in x]))
If you want a list instead, drop the join; otherwise the matches are concatenated into a single comma-separated string.
Output
>>> df['contains']
0         bill
1
2
3         bill
4         carl
5
6
7         bill
8    bill,carl
9
Name: contains, dtype: object
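If one row per match is preferred, as in the layout shown in the question, the list variant (without the join) can be expanded with DataFrame.explode. The following is only a sketch under a few assumptions: pandas 0.25 or later is installed, and the names `names` and `long_df` are made up here for illustration.

```python
import pandas as pd

d = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'text': ['bill did this', 'jim did something', 'phil', 'bill did nothing',
              'carl was here', 'this is random', 'other name',
              'other bill', 'bill and carl', 'last one']}
df = pd.DataFrame(data=d)
names = ['bill', 'carl']  # hypothetical name; avoids shadowing the built-in `list`

# Keep the matches as a list (no join), so each row holds zero or more names.
df['contains'] = df['text'].apply(lambda x: [n for n in names if n in x])

# explode() emits one row per list element; a row with an empty list
# yields a single row with NaN, which is replaced by an empty string here.
long_df = (df.explode('contains')
             .fillna({'contains': ''})
             .reset_index(drop=True))
print(long_df)
```

With this, id 9 appears twice, once for bill and once for carl. Note that `n in x` is a substring test, so a text such as 'billy' would also count as a match for 'bill'; a regular expression with word boundaries would be needed if only whole words should match.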