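I have a column containing multiple rows of text. For example:
"Saturday morning was nice"
"Yesterday was Friday"
"He went to the store at night"
How can I extract certain keywords from each row, such as the day of the week ("Monday", "Tuesday", "Wednesday", etc.), and store them in a new column? I would rather not loop. Maybe use a lambda function?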
Answer 0 (score: 2)
No lambda is needed; str.findall does this directly:
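# Define your own list of keywords; here, the days of the week
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
# findall returns, for each row, a list of every keyword that matches
df['Newcol'] = df.Date.str.findall('|'.join(days))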
0    [Saturday]
1      [Friday]
2            []
Name: Date, dtype: object
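Since findall gives back a list per row, you may want a plain string in the new column instead. Here is a minimal, self-contained sketch of that, assuming the column is called Date as in the answer above and using the three sentences from the question as sample data. The re.escape call and the \b word boundaries are my additions, not part of the original answer; they only matter if your keywords contain regex metacharacters or can occur inside longer words.

```python
import re

import pandas as pd

# Sample data taken from the question; the column name "Date" follows the answer.
df = pd.DataFrame({
    "Date": [
        "Saturday morning was nice",
        "Yesterday was Friday",
        "He went to the store at night",
    ]
})

days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Escape each keyword and wrap the alternation in word boundaries
# (an added safeguard, not in the original answer).
pattern = r"\b(?:" + "|".join(map(re.escape, days)) + r")\b"

# findall yields one list per row; str.join collapses each list into a string.
# Rows without any match end up as an empty string.
df["Newcol"] = df["Date"].str.findall(pattern).str.join(", ")
print(df)
```

If a row mentions several days, they all end up in the same cell, separated by commas.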
Answer 1 (score: 0)
You can use pandas.Series.str.extract for this:
words = ['Saturday', 'Friday']
df['Word'] = df['Strings'].str.extract(r'({})'.format('|'.join(words)))
print(df)
                         Strings      Word
0      Saturday morning was nice  Saturday
1           Yesterday was Friday    Friday
2  He went to the store at night       NaN
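Note that str.extract keeps only the first match in each row, whereas str.findall keeps all of them. A rough sketch of the extract approach with case-insensitive matching follows. The lowercase row and the row with two days are hypothetical data I added to show the behaviour, and flags=re.IGNORECASE, expand=False, and the \b boundaries are my own choices rather than part of the answer above.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "Strings": [
        "Saturday morning was nice",
        "yesterday was friday",            # hypothetical lowercase variant
        "He went to the store at night",
        "Open Monday and Tuesday",         # hypothetical row with two days
    ]
})

words = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]

# One capture group around the alternation; extract needs at least one group.
pattern = r"\b(" + "|".join(map(re.escape, words)) + r")\b"

# expand=False returns a Series (rather than a one-column DataFrame)
# when the pattern has a single group; unmatched rows become NaN.
df["Word"] = df["Strings"].str.extract(pattern, flags=re.IGNORECASE, expand=False)
print(df)
```

The matched text is returned as it appears in the row, so the second row gives "friday" rather than "Friday". Chain .str.capitalize() if you need a uniform spelling.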