I have strings like these:
str1 = "Information name: Wen Moyes address: Mcity."
str2 = "resume Name : Sam Win Father's name: Dean address"
str3 = "Father's name: Dan. Acknowledge"
str4 = "Father's Name: Joe Cena Name :- John Cena"
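I want to extract the name that follows Name in each string. If a string contains Father's name, that part should be ignored and only the plain name field should be matched.
My expected output is: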
Wen Moyes
Sam Win
None
John Cena
What I have tried:
I used the following regex:
re.findall(r'name\s*:(\s*\w*\s\w*)', str1.lower())
This gives me the following output (one line per string):
[' wen moyes']
[' sam win', ' dean address']
[' dan']
[' joe cena']
How should I handle this?
Is there an alternative that does not use regular expressions?
Thanks!
Answer 0 (score: 1)
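One option is to use a negative lookbehind for Father's followed by a space, then match Name: with an optional space and dash, and then capture the following two words with (\w+ \w+):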
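import re

str1 = "Information name: Wen Moyes address: Mcity."
str2 = "resume Name : Sam Win Father's name: Dean address"
str3 = "Father's name: Dan. Acknowledge"
str4 = "Father's Name: Joe Cena Name :- John Cena"

# (?<!Father's ) rejects any "name" that directly follows "Father's ".
pattern = re.compile(r"(?<!Father's )[Nn]ame ?:-? (\w+ \w+)")
for s in [str1, str2, str3, str4]:  # "s" avoids shadowing the built-in str
    print(pattern.findall(s))
# Prints: ['Wen Moyes'], ['Sam Win'], [], ['John Cena']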
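The question expects a single name or None for each string rather than a list, and it also asks about a non-regex route. The following is only a sketch of both, not a definitive solution. The helper names extract_name and extract_name_no_regex are made up for illustration. Both assume the name is exactly two words, and the non-regex version splits on whitespace, so unlike \w+ it would keep any punctuation attached to a word.

```python
import re

# Same pattern as in the answer above.
PATTERN = re.compile(r"(?<!Father's )[Nn]ame ?:-? (\w+ \w+)")


def extract_name(text):
    """Return the first name not preceded by "Father's ", or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


def extract_name_no_regex(text):
    """Hypothetical non-regex variant using str.find and str.split."""
    lowered = text.lower()
    start = 0
    while True:
        idx = lowered.find("name", start)
        if idx == -1:
            return None  # no usable "name" field left
        start = idx + len("name")
        # Skip occurrences that belong to "Father's name".
        if lowered[:idx].endswith("father's "):
            continue
        rest = text[start:].lstrip()
        if not rest.startswith(":"):
            continue  # "name" was not followed by a colon
        # Drop the colon, then any dash or spaces before the value.
        words = rest[1:].lstrip("- ").split()
        if len(words) >= 2:
            return " ".join(words[:2])


if __name__ == "__main__":
    samples = [
        "Information name: Wen Moyes address: Mcity.",
        "resume Name : Sam Win Father's name: Dean address",
        "Father's name: Dan. Acknowledge",
        "Father's Name: Joe Cena Name :- John Cena",
    ]
    for sample in samples:
        print(extract_name(sample), "|", extract_name_no_regex(sample))
```

The regex version is shorter and less likely to go wrong on odd input. The string-method version is mainly useful if regular expressions have to be avoided.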