Question

This is the line to parse: 001000000 +3 12091992 +2 0200 +3

I used:

Z = re.compile('(?P<stop_id>\d{9}) (?P<time_displacement>([-|+]\d{0,4})*)', flags=re.UNICODE)

m = Z.search('001000000 +3 12091992 +2 0200 +3')
if m:
    yield {
           'stop_id': m.group('stop_id')
          }
    if m.group('time_displacement'):
        _suffix=_suffix + 1
        yield {
               'time_displacement' + str(_suffix): m.group('time_displacement')
              }

The result is:

[{'stop_id': '001000000'}, {'time_displacement': '+3'}]

But I need:

[{'stop_id': '001000000'}, {'time_displacement1': '+3'}, {'time_displacement2': '+2'}, {'time_displacement3': '+3'}]

Answer 1

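Why are you using yield? Presumably the code you posted is part of a generator function. Please consider posting an actual runnable snippet...

Why do you want a list of single-element dicts rather than putting everything into one dict?

From your code and data sample I'm not entirely sure what should and shouldn't match, but hopefully this does what you need... or comes reasonably close. :)

Anyway, you can do it like this: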

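import re

fields = ('stop_id', 'time_displacement')
# Group 1: a nine-digit stop id; group 2: a signed displacement.
# The character class is [-+]; a '|' inside [...] would match a literal pipe.
pat = re.compile(r'(\d{9})|([-+]\d{0,4})')

data = '001000000 +3 12091992 +2 0200 +3'
# findall returns (group1, group2) tuples; the group that did not match is ''.
found = pat.findall(data)
# print(found)

result = []
suffix = 1
for p1, p2 in found:
    if p2 == '':
        result.append({fields[0]: p1})
    elif p1 == '':
        result.append({fields[1] + str(suffix): p2})
        suffix += 1

print(result)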

Output

[{'stop_id': '001000000'}, {'time_displacement1': '+3'}, {'time_displacement2': '+2'}, {'time_displacement3': '+3'}]
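
If a single dict is acceptable, as asked at the top of this answer, a minimal sketch could look like the following. The function name `parse_line` is hypothetical. The pattern is the one used above, written with `[-+]` on the assumption that a literal `|` should not match, and the sketch assumes every signed number is a displacement.

```python
import re

# Assumed token shapes: a nine-digit stop id, or a sign followed by up to four digits.
PAT = re.compile(r'(\d{9})|([-+]\d{0,4})')

def parse_line(line):
    """Collect the stop id and the numbered displacements into one dict."""
    result = {}
    suffix = 1
    for stop_id, displacement in PAT.findall(line):
        if stop_id:
            result['stop_id'] = stop_id
        else:
            # Number the displacements in the order they appear.
            result['time_displacement%d' % suffix] = displacement
            suffix += 1
    return result

print(parse_line('001000000 +3 12091992 +2 0200 +3'))
```

With one dict, each value can be looked up directly by key instead of scanning a list of one-item dicts.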

Answer 2

(?P<stop_id>\d{9})|(?P<time_displacement>(?:[-|+]\d{0,4}))

Try this. See the demo.

http://regex101.com/r/nA6hN9/1
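
A minimal sketch of how this pattern might be applied from Python, assuming `re.finditer` together with the match object's `lastgroup` attribute to tell which named alternative matched. The function name `parse_tokens` is hypothetical, and the numbering of displacements follows the output requested in the question.

```python
import re

# The pattern from this answer, unchanged.
PAT = re.compile(r'(?P<stop_id>\d{9})|(?P<time_displacement>(?:[-|+]\d{0,4}))')

def parse_tokens(line):
    """Yield one single-key dict per token, numbering the displacements."""
    suffix = 0
    for m in PAT.finditer(line):
        kind = m.lastgroup  # name of the named group that matched
        if kind == 'time_displacement':
            suffix += 1
            yield {kind + str(suffix): m.group(kind)}
        else:
            yield {kind: m.group(kind)}

print(list(parse_tokens('001000000 +3 12091992 +2 0200 +3')))
```

Because the function is a generator, it keeps the `yield` style of the code in the question while visiting every match rather than only the first.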

Parsing repeated occurrences of +\d in a line
