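I want to use re.findall() to match instances of company names on a review website. For example, I want to capture the names in a list like the one in the following example: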
website_html = ', Jimmy Bob's Tune & Lube, Allen's Bar & Grill, Joanne's - Restaurant,'
name_list = re.findall('[,]\s*([\w\'&]*\s?)*[,]', website_html)
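My code doesn't capture any of the names. Any ideas?
Answer 0 (score: 1)
You provided only one input example, so this answer is based on your question as posted: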
import re

# I replaced the single quotes at the start and end of your input with double
# quotes, because the apostrophe in Bob's throws a SyntaxError: invalid syntax
website_html = ", Jimmy Bob's Tune & Lube,"
# I swapped re.findall for re.search, because you only had one example, so
# re.search or re.match works.
name_list = re.search(r'[,]\s*([\w\'&]*\s?)*[,]', website_html)
print(name_list.group(0))
# output
, Jimmy Bob's Tune & Lube,
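Note that re.search returns None when nothing matches, so calling .group(0) directly would raise an AttributeError on other inputs. The sketch below guards against that and strips the delimiting commas from the match. The names match and name are my own and not part of the original code.

```python
import re

website_html = ", Jimmy Bob's Tune & Lube,"

# Same pattern as above; re.search returns None when nothing matches.
match = re.search(r'[,]\s*([\w\'&]*\s?)*[,]', website_html)

if match is not None:
    # group(0) is the whole match, including the delimiting commas,
    # so strip commas and spaces from both ends.
    name = match.group(0).strip(', ')
    print(name)
else:
    name = None
    print('no match')
```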
If you have other input values in website_html, please provide them so I can revise my answer.
Here is a version that uses re.findall.
import re

# I replaced the single quotes at the start and end of your input with double
# quotes, because the apostrophe in Bob's throws a SyntaxError: invalid syntax
website_html = ", Jimmy Bob's Tune & Lube,"
# I wrapped your pattern in a capture group
name_list = re.findall(r'([,]\s*([\w\'&]*\s?)*[,])', website_html)
print(type(name_list))
# output
<class 'list'>
print(name_list)
# output
[(", Jimmy Bob's Tune & Lube,", '')]
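The empty string in the second tuple position appears because re.findall returns one entry per capture group, and a repeated group keeps only its last iteration, which here is an empty match. As far as I can tell, this is also why the pattern in the question seems to capture nothing. A minimal sketch of one way around it, assuming the names contain only word characters, apostrophes, ampersands and single spaces: make the repeated group non-capturing and capture the whole name once. The pattern and the variable names are my own.

```python
import re

website_html = ", Jimmy Bob's Tune & Lube,"

# (?:...) groups without capturing, so the only capture group is the
# outer one that spans the whole name between the commas.
pattern = r',\s*((?:[\w\'&]+\s?)+),'
names = re.findall(pattern, website_html)
print(names)
```

With a single capture group, re.findall returns a flat list of strings instead of tuples.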
Updated answer
This answer is based on the modified input from the original question.
import re

website_html = ", Jimmy Bob's Tune & Lube, Allen's Bar & Grill, Joanne's - Restaurant,"
# Greedy match from the first comma to the last comma
name_list = re.findall(r',.*,', website_html)
for item in name_list:
    # Split on commas, then drop surrounding whitespace and empty pieces
    split_strings = item.split(',')
    for string in split_strings:
        if string.strip():
            print(string.strip())
# output
Jimmy Bob's Tune & Lube
Allen's Bar & Grill
Joanne's - Restaurant
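An alternative that lets re.findall return the names directly is to use a lookahead for the closing comma. Adjacent names share a comma, and a comma consumed as the end of one match is not available to start the next one. This is only a sketch, assuming the names themselves never contain a comma. The pattern and the variable names are my own.

```python
import re

website_html = ", Jimmy Bob's Tune & Lube, Allen's Bar & Grill, Joanne's - Restaurant,"

# (?=,) checks for the closing comma without consuming it, so the same
# comma can open the next match. [^,]+? matches any run of non-comma
# characters, which also covers the hyphen in "Joanne's - Restaurant".
names = re.findall(r',\s*([^,]+?)\s*(?=,)', website_html)
for name in names:
    print(name)
```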