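I have a regular expression that should match several things. Below are example strings and a link to what I have so far. Because of a mistake in the regex that I cannot pin down, some lines are not recognized: http://regex101.com/r/oL4bB5/1
Example strings: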
eg1:Tommy Berry
eg2:Ms Winona Costin (a3/47kg)
eg3:Ms Kathy O'Hara
Desired result using findall in Python:
eg1:[' Tommy Berry']
eg2:[' Ms',' Winona Costin',' 3',' 47']
eg3:[' Ms'," Kathy O'Hara"]
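As you can see, I want to isolate the Ms at the start of the string and the numbers inside the brackets, and keep the full name.
Thanks for your help!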
Edit
Names may contain digits and special characters such as ' - . etc.
For example: Samuel L. Jackson-Pitt
Answer 0 (score: 1)
I think you want something like this:
^(Ms)?\s*([\w '-]+)(?= \(|$)(?: *\(\D*(\d+)\D*(\d+)[^\n]*)?$
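Read piece by piece, the pattern has an optional title, a name, and an optional bracketed part with two numbers. The following is a minimal sketch of the same pattern written with re.VERBOSE so each piece can carry a comment. The name NAME_RE and the sample string are only illustrative, and the literal spaces are written as [ ] because verbose mode ignores unescaped whitespace outside character classes.

```python
import re

# Hypothetical name; the pattern is the one-line regex above, spread out.
NAME_RE = re.compile(r"""
    ^(Ms)?          # group 1: optional "Ms" at the start of the line
    \s*             # whitespace between the title and the name
    ([\w '-]+)      # group 2: the name (word chars, spaces, apostrophes, hyphens)
    (?=[ ]\(|$)     # the name must be followed by " (" or by the end of the line
    (?:             # optional bracketed part such as "(a3/47kg)"
        [ ]*\(      #   spaces and the opening parenthesis
        \D*(\d+)    #   group 3: first run of digits
        \D*(\d+)    #   group 4: second run of digits
        [^\n]*      #   whatever is left on the line
    )?
    $
""", re.MULTILINE | re.VERBOSE)

sample = "Tommy Berry\nMs Winona Costin (a3/47kg)\nMs Kathy O'Hara"
print(NAME_RE.findall(sample))
```

The lookahead is what stops the name group from swallowing the space before the parenthesis: the greedy name match backtracks until it is followed by " (" or the line end.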
>>> import re
>>> s = """Brodie Loy (a3/53kg)
Hugh Bowman
Ms Winona Costin (a3/47kg)
James McDonald
Ms Kathy O'Hara"""
>>> m = re.findall(r"^(Ms)?\s*([\w '-]+)(?= \(|$)(?: *\(\D*(\d+)\D*(\d+)[^\n]*)?$", s, re.M)
>>> m
[('', 'Brodie Loy', '3', '53'), ('', 'Hugh Bowman', '', ''), ('Ms', 'Winona Costin', '3', '47'), ('', 'James McDonald', '', ''), ('Ms', "Kathy O'Hara", '', '')]
>>> # findall returns '' for groups that did not match; drop those empty strings
>>> [tuple(g for g in tup if g) for tup in m]
[('Brodie Loy', '3', '53'), ('Hugh Bowman',), ('Ms', 'Winona Costin', '3', '47'), ('James McDonald',), ('Ms', "Kathy O'Hara")]
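The question's edit mentions names such as Samuel L. Jackson-Pitt. The character class [\w '-] already allows digits, apostrophes and hyphens but not a dot, so that name would not match. A possible adjustment, offered only as a sketch, is to add . to the class. The names PATTERN and parse_line are hypothetical, and the wider character class is an assumption based on the edit, not part of the answer as posted.

```python
import re

# The answer's pattern with "." added to the name character class.
# Assumption: this widening follows the question's edit; it is not in the original answer.
PATTERN = re.compile(
    r"^(Ms)?\s*([\w .'-]+)(?= \(|$)(?: *\(\D*(\d+)\D*(\d+)[^\n]*)?$",
    re.M,
)

def parse_line(line):
    """Return the non-empty captured fields for one line, or () if nothing matches."""
    m = PATTERN.search(line)
    if m is None:
        return ()
    # Groups that did not take part in the match are None; filter them out.
    return tuple(g for g in m.groups() if g)

print(parse_line("Samuel L. Jackson-Pitt"))
print(parse_line("Ms Winona Costin (a3/47kg)"))
```

Further special characters can be added to the class the same way, as long as the hyphen stays last so it is read literally rather than as a range.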
Answer 1 (score: 1)