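I have a DataFrame with two string columns, where column1 contains some keywords that also appear in column2. I want to extract the keywords that match between column1 and column2 into a new column.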
df['column3'] = df.column1.apply(lambda x: df.column2[df.column2.str.contains(x)])
I expect output like this:
column1                    column2                column3
A girl is going to market  girl market school     girl market
A girl is going to school  girl market school     girl school
The sky is blue in color   sky blue orange color  sky blue color
Answer 0 (score: 4)
Use apply.
For example:
df["column3"] = df.apply(lambda x: " ".join(i for i in x["column2"].split() if i in x["column1"]),axis=1)
print(df)
Output:
   column1                    column2                column3
0  A girl is going to market  girl market school     girl market
1  A girl is going to school  girl market school     girl school
2  The sky is blue in color   sky blue orange color  sky blue color
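One thing to watch with the one-liner above: `i in x["column1"]` is a substring test against the whole sentence, so a keyword such as "go" would also match inside "going". If whole-word matching is what you need, a minimal sketch could look like the following. The helper name `match_whole_words` is made up for illustration, and the DataFrame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column1": ["A girl is going to market",
                "A girl is going to school",
                "The sky is blue in color"],
    "column2": ["girl market school",
                "girl market school",
                "sky blue orange color"],
})

def match_whole_words(row):
    # Hypothetical helper: split column1 into a set so that the
    # membership test compares whole words, not substrings.
    words = set(row["column1"].split())
    # Iterate over column2 so the keywords keep column2's order.
    return " ".join(k for k in row["column2"].split() if k in words)

df["column3"] = df.apply(match_whole_words, axis=1)
print(df["column3"].tolist())
```

On the sample data this gives the same column3 as above. The two approaches only differ when a keyword is a fragment of a longer word.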
Answer 1 (score: 3)
import numpy as np
df['column3'] = df.apply(lambda x: ' '.join(np.intersect1d(x['column1'].split(), x['column2'].split())), axis=1)
Output:
   column1                    column2                column3
0  A girl is going to market  girl market school     girl market
1  A girl is going to school  girl market school     girl school
2  The sky is blue in color   sky blue orange color  blue color sky
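The last row comes out as "blue color sky" because np.intersect1d returns the sorted, unique values common to both inputs, so the original word order is lost. A small sketch on that row alone (the variable names are only for illustration):

```python
import numpy as np

sentence = "The sky is blue in color".split()
keywords = "sky blue orange color".split()

# intersect1d returns the common values sorted, not in sentence order.
common = np.intersect1d(sentence, keywords)
print(common.tolist())
```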
If order matters:
df['column3'] = df.apply(lambda x: ' '.join(np.array(x['column1'].split())[np.isin(x['column1'].split(), x['column2'].split())]), axis=1)
Output:
   column1                    column2                column3
0  A girl is going to market  girl market school     girl market
1  A girl is going to school  girl market school     girl school
2  The sky is blue in color   sky blue orange color  sky blue color
Answer 2 (score: 1)
Another solution, using the intersection (&) of sets:
df['column3'] = df.apply(lambda x: ' '.join(set(x['column1'].split()) &
set(x['column2'].split())), axis=1)
[out]
   column1                    column2                column3
0  A girl is going to market  girl market school     market girl
1  A girl is going to school  girl market school     girl school
2  The sky is blue in color   sky blue orange color  sky color blue
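Note that sets are unordered, which is why rows 0 and 2 come out as "market girl" and "sky color blue". The order may also differ from one run to the next. If you want the set approach with a stable order, one option is to sort the intersection by each word's position in column1. This is only a sketch: `ordered_intersection` is a hypothetical helper name, and the DataFrame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "column1": ["A girl is going to market",
                "A girl is going to school",
                "The sky is blue in color"],
    "column2": ["girl market school",
                "girl market school",
                "sky blue orange color"],
})

def ordered_intersection(row):
    # Hypothetical helper: intersect as sets, then restore a stable order.
    words = row["column1"].split()
    common = set(words) & set(row["column2"].split())
    # words.index gives each word's first position in column1.
    return " ".join(sorted(common, key=words.index))

df["column3"] = df.apply(ordered_intersection, axis=1)
print(df["column3"].tolist())
```

Calling words.index for every common word is fine for short sentences like these. For long texts, a precomputed word-to-position dict would avoid the repeated scans.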