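I have this regular expression: ([^\s|:]+):\s*([^\s|:]+)
It works for input such as name:jones|location:london|age:23
How can I extend the pattern so that it also covers words separated by spaces, or words combined with numbers? For example: full name:jones hardy|city and dialling code :london 0044|age:23 years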
>>> ("full name", "jones hardy") ("city and dialling code", "london 0044")("age","23 years")
Answer 0 (score: 2)
This looks like a case for re.split.
>>> import re
>>> s = "full name:jones hardy|city and dialling " \
...     "code :london 0044|age:23 years"
>>> [tuple(re.split(r'\s*:\s*', t))
...  for t in re.split(r'\s*\|\s*', s)]
[('full name', 'jones hardy'),
('city and dialling code', 'london 0044'),
('age', '23 years')]
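As a minimal sketch, the same split-based idea can be wrapped in a small helper. The name parse_pairs is hypothetical, and maxsplit=1 is an added assumption (not part of the answer above) so that a value containing a further colon stays in one piece.

```python
import re

def parse_pairs(text):
    """Split 'key:value|key:value' text into a list of (key, value) tuples."""
    pairs = []
    # Split on pipes first, swallowing whitespace around each pipe.
    for field in re.split(r"\s*\|\s*", text.strip()):
        # maxsplit=1 (assumption): only the first colon separates key and value.
        # A field without any colon raises ValueError on unpacking.
        key, value = re.split(r"\s*:\s*", field, maxsplit=1)
        pairs.append((key, value))
    return pairs

s = "full name:jones hardy|city and dialling code :london 0044|age:23 years"
print(parse_pairs(s))
```

Splitting on the delimiters avoids having to describe what a key or value may contain; only the separators need a pattern.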
Answer 1 (score: 2)
>>> import re
>>> s = "full name:jones hardy|city and dialling code :london 0044|age:23 years"
>>> r = r"([^|:]+?)\s*:\s*([^|:]+)"
>>> re.findall(r, s)
[('full name', 'jones hardy'), ('city and dialling code', 'london 0044'), ('age', '23 years')]
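So the trailing space in 'city and dialling code ' is dropped, because the first group is lazy and \s* consumes the whitespace before the colon.
However, a space directly before a '|' is not removed: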
>>> s="full name:jones hardy |city and dialling code :london 0044|age:23 years"
>>> re.findall(r, s)
[('full name', 'jones hardy '), ('city and dialling code', 'london 0044'), ('age', '23 years')]
Note the trailing space in 'jones hardy '.
The pattern r"\s*([\w\s]+?)\s*:\s*([\w\s]+?)\s*(?:\||$)" strips all whitespace at the start and end of each captured string:
>>> s
' full name: jones hardy | city and dialling code :london 0044|age:23 years'
>>> r=r"\s*([\w\s]+?)\s*:\s*([\w\s]+?)\s*(?:\||$)"
>>> re.findall(r, s)
[('full name', 'jones hardy'), ('city and dialling code', 'london 0044'), ('age', '23 years')]
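Because [\w\s] admits only word characters and whitespace, a field containing punctuation (an e-mail address, say) would not be captured intact by the pattern above. One possible alternative, sketched here as an assumption rather than as part of the answer, is to capture everything except the delimiters and trim in Python. The helper name find_pairs and the sample input are made up for illustration.

```python
import re

# Capture any run of non-delimiter characters on each side of a colon.
PAIR = re.compile(r"([^|:]+):([^|:]+)")

def find_pairs(text):
    """Return (key, value) tuples with surrounding whitespace stripped."""
    return [(k.strip(), v.strip()) for k, v in PAIR.findall(text)]

# Hypothetical input with punctuation inside a key and a value.
s = " full name: jones hardy | e-mail :j.hardy@example.com|age:23 years"
print(find_pairs(s))
```

Doing the trimming with str.strip() keeps the pattern itself short, at the cost of a second pass over each captured string.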
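Answer 2 (score: 1)
Simplify your regex: capture everything except the delimiters, which in your case are the colon : and the pipe |. Note that the greedy first group keeps any whitespace before the colon, as in 'city and dialling code ' below.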
>>> import re
>>> r = r"([^:|]+)\s*:\s*([^:|]+)"
>>> st = "full name:jones hardy|city and dialling code :london 0044"
>>> re.findall(r, st)
[('full name', 'jones hardy'), ('city and dialling code ', 'london 0044')]
>>> st="name:jones|location:london|age:23"
>>> re.findall(r, st)
[('name', 'jones'), ('location', 'london'), ('age', '23')]