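So, here's the problem. I have this column:

[Image: column of product names]

I need to create a function that takes a keyword and returns all products whose name contains that word. For this particular problem, the keyword is "Wheat". My function is as follows: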
    def find_word(keyword):
        word = []
        for i in range(len(df)):
            if keyword in df['name'][i]:
                word.append(df['name'][i])
        return word

    find_word("Wheat")
This is what gets returned:
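    ['Cream of Wheat (Quick)', 'Crispy Wheat & Raisins', 'Frosted Mini-Wheat',
     'Nutri-grain Wheat', 'Puffed Wheat', 'Shredded Wheat',
     "Shredded Wheat'n'Bran", 'Shredded Wheat spoon size',
     'Strawberry Fruit Wheats', 'Wheat Chex', 'Wheaties', 'Wheaties Honey Gold']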
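You can see that a few entries near the end, including the last two, don't belong: they only contain the keyword as part of a longer word. I'm not sure how to structure the function so that it excludes these cases.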
Answer 0 (score: 0)

I think you can get what you want with the re module, as follows:
    import re

    def find_word(keyword):
        word = []
        # Build a pattern that matches the keyword only as a whole word.
        # \b is a word boundary; re.escape keeps special characters in the keyword literal.
        p = re.compile(r'\b{}\b'.format(re.escape(keyword)))
        for i in range(len(df)):
            # search each name for the pattern
            if p.search(df['name'][i]):
                word.append(df['name'][i])
        return word

    find_word("Wheat")

Output:

    ['Cream of Wheat (Quick)', 'Crispy Wheat & Raisins', 'Frosted Mini-Wheat',
     'Nutri-grain Wheat', 'Puffed Wheat', 'Shredded Wheat',
     "Shredded Wheat'n'Bran", 'Shredded Wheat spoon size', 'Wheat Chex']
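To try the word-boundary idea without the asker's DataFrame, here is a minimal sketch that works on a plain list of strings. The helper name `filter_whole_word`, the `ignore_case` parameter, and the short `sample_names` list are assumptions made for illustration; they are not part of the original question or answer.

```python
import re

def filter_whole_word(names, keyword, ignore_case=False):
    """Return the names that contain `keyword` as a whole word (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape keeps characters such as "." or "+" in the keyword literal
    pattern = re.compile(r"\b{}\b".format(re.escape(keyword)), flags)
    return [name for name in names if pattern.search(name)]

# Illustrative sample only, not the asker's full column
sample_names = [
    "Cream of Wheat (Quick)",
    "Frosted Mini-Wheat",
    "Shredded Wheat'n'Bran",
    "Wheat Chex",
    "Wheaties",
    "Wheaties Honey Gold",
]

whole_word_matches = filter_whole_word(sample_names, "Wheat")
print(whole_word_matches)
```

Since `\b` sits between a word character and a non-word character, a hyphen or an apostrophe next to "Wheat" still counts as a boundary, while the "i" in "Wheaties" does not. The match is case-sensitive unless `ignore_case=True` is passed.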
Answer 1 (score: 0)

Try this.
    df[df["name"].str.contains(r'(?:\s|^)Wheat(?:\s|$)')]["name"]
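One thing to check before relying on this pattern: it requires whitespace (or the start or end of the string) on both sides of "Wheat", so it is stricter than a `\b` word boundary. The sketch below compares the two patterns with plain `re.search`, which is the matching that `str.contains` applies to each value when given a regex. The `sample_names` list is an illustrative assumption, not the asker's data.

```python
import re

word_boundary = re.compile(r"\bWheat\b")
whitespace_delimited = re.compile(r"(?:\s|^)Wheat(?:\s|$)")

# Illustrative names only
sample_names = [
    "Puffed Wheat",
    "Wheat Chex",
    "Frosted Mini-Wheat",
    "Shredded Wheat'n'Bran",
    "Wheaties",
]

boundary_hits = [n for n in sample_names if word_boundary.search(n)]
whitespace_hits = [n for n in sample_names if whitespace_delimited.search(n)]

# Names that the word-boundary pattern accepts but the whitespace pattern rejects
only_boundary = [n for n in boundary_hits if n not in whitespace_hits]
print(whitespace_hits)
print(only_boundary)
```

Both patterns reject "Wheaties", but names where "Wheat" touches a hyphen or an apostrophe are kept only by the `\b` version. Which behaviour is right depends on whether those names should count as matches.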