Regex to match words that have spaces between them

Time: 2014-07-11 08:14:31

Tags: python regex

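I have this regex, ([^\s|:]+):\s*([^\s|:]+), which works for name:jones|location:london|age:23. How can I extend the pattern to cover keys and values that contain spaces, or a mix of words and numbers, for example: full name:jones hardy|city and dialling code :london 0044|age:23 years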

>>> ("full name", "jones hardy") ("city and dialling code", "london 0044")("age","23 years")
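For reference, here is a quick check of how the pattern from the question behaves on both inputs. It is only a sketch; the variable names original, simple and spaced are chosen here for illustration.

```python
import re

# The pattern from the question: neither group may contain whitespace.
original = re.compile(r"([^\s|:]+):\s*([^\s|:]+)")

simple = "name:jones|location:london|age:23"
spaced = "full name:jones hardy|city and dialling code :london 0044|age:23 years"

print(original.findall(simple))
print(original.findall(spaced))
```

On the second string only ('name', 'jones') and ('age', '23') come back: \s is excluded from both groups, so each key and value is cut at the nearest space, and the space before the colon in "code :" prevents that field from matching at all.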

3 answers:

Answer 0 (score: 2)

This looks like a job for re.split: split the string on | first, then split each piece on :.

>>> import re
>>> s = "full name:jones hardy|city and dialling " \
...     "code :london 0044|age:23 years"
>>> [tuple(re.split(r'\s*:\s*', t))
...  for t in re.split(r'\s*\|\s*', s)]
[('full name', 'jones hardy'),
 ('city and dialling code', 'london 0044'),
 ('age', '23 years')]
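The same idea can be wrapped in a small helper. This is a minimal sketch, not part of the answer: the name parse_pairs is hypothetical, and it assumes every field contains at least one colon (a field without one would raise ValueError when unpacking). Passing maxsplit=1 is an added choice so that a colon inside a value stays in the value.

```python
import re

def parse_pairs(text):
    """Split on '|' into fields, then split each field on its first ':'."""
    pairs = []
    for field in re.split(r"\s*\|\s*", text.strip()):
        # maxsplit=1 keeps any later ':' inside the value.
        key, value = re.split(r"\s*:\s*", field, maxsplit=1)
        pairs.append((key, value))
    return pairs

s = "full name:jones hardy|city and dialling code :london 0044|age:23 years"
print(parse_pairs(s))
```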

Answer 1 (score: 2)

>>> import re
>>> s = "full name:jones hardy|city and dialling code :london 0044|age:23 years"
>>> r = r"([^|:]+?)\s*:\s*([^|:]+)"
>>> re.findall(r, s)
[('full name', 'jones hardy'), ('city and dialling code', 'london 0044'), ('age', '23 years')]

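Because the first group is lazy, the \s* before the colon consumes the trailing space, so the space at the end of 'city and dialling code ' is removed.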

But a space immediately before '|' is not removed:

>>> s="full name:jones hardy |city and dialling code :london 0044|age:23 years"
>>> re.findall(r, s)
[('full name', 'jones hardy '), ('city and dialling code', 'london 0044'), ('age', '23 years')]

The second group is greedy and [^|:] also matches whitespace, so the space at the end of 'jones hardy ' is kept.

Edit

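r"\s*([\w\s]+?)\s*:\s*([\w\s]+?)\s*(?:\||$)" makes both groups lazy and consumes the whitespace around them, so all spaces at the start and end of each key and value are removed: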

>>> s
'  full name: jones hardy | city and dialling code :london 0044|age:23 years'
>>> r=r"\s*([\w\s]+?)\s*:\s*([\w\s]+?)\s*(?:\||$)"
>>> re.findall(r, s)
[('full name', 'jones hardy'), ('city and dialling code', 'london 0044'), ('age', '23 years')]
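To make the pieces of that pattern easier to follow, here is the same regex written with re.VERBOSE and one comment per part. This is only a sketch; the name PAIR is chosen here for illustration.

```python
import re

PAIR = re.compile(r"""
    \s*            # skip whitespace before the key
    ([\w\s]+?)     # key: word characters and spaces, as few as possible
    \s*:\s*        # colon with optional whitespace around it
    ([\w\s]+?)     # value: same character set, also lazy
    \s*            # swallow whitespace after the value
    (?:\||$)       # the field ends at a pipe or at the end of the string
""", re.VERBOSE)

s = "  full name: jones hardy | city and dialling code :london 0044|age:23 years"
print(PAIR.findall(s))
```

One trade-off to keep in mind: [\w\s] only allows word characters and whitespace, so a field whose value contains other punctuation (for example name:jones-hardy) is skipped rather than captured.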

Answer 2 (score: 1)

Simplify your regex: capture everything except the delimiters, which in your case are the colon : and the pipe |.

>>> import re
>>> r = r"([^:|]+)\s*:\s*([^:|]+)"
>>> st = "full name:jones hardy|city and dialling code :london 0044"
>>> re.findall(r, st)
[('full name', 'jones hardy'), ('city and dialling code ', 'london 0044')]
>>> st="name:jones|location:london|age:23"
>>> re.findall(r, st)
[('name', 'jones'), ('location', 'london'), ('age', '23')]
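Note that the greedy first group keeps the space before the colon ('city and dialling code ' above). One possible way to tidy that up, sketched here rather than taken from the answer, is to capture with the simple delimiter-based pattern and call str.strip() on each part afterwards. The names PATTERN and find_pairs are hypothetical.

```python
import re

# Capture everything between the delimiters; whitespace is cleaned up afterwards.
PATTERN = re.compile(r"([^:|]+):([^:|]+)")

def find_pairs(text):
    """Return (key, value) tuples with surrounding whitespace stripped."""
    return [(key.strip(), value.strip()) for key, value in PATTERN.findall(text)]

st = "full name:jones hardy |city and dialling code :london 0044| age:23 years"
print(find_pairs(st))
```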