Question

So, I'm still new to regex and Python. I've been searching for a while, but I don't know how to phrase what I'm looking for.

I need to get the data from a formatted string into a list of lists or a list of dictionaries.

-------------------------------------------------------------------
Frank         114      0         0         0          0         114       
Joe           49       1         0         0          0         50        
Bob           37       0         0         0          0         37        
Sally         34       2         0         0          0         36

This is the output of a script. Currently I have:

import re

match_list = []
match = re.search(r'\n(\w+)\s+(\d*)\s+(\d*)', output)
if match:
    match_list.append([match.group(1),
                       match.group(2),
                       match.group(3)])
>>> print(match_list)
[['Frank', '114', '0']]

This is perfect, except that I need match_list to return:

[['Frank', '114', '0'],
 ['Joe', '49', '1'],
 ['Bob', '37', '0'],
 ['Sally', '34', '2']]

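My first thought was a for loop that checks whether match.group(1) is already in the list and, if so, moves on to the next match, but then I realized I have no idea how to do that. So there you have it. I'm having a hard time figuring this out. Any help would be great! :)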

Oh, also: the size of the list changes. Sometimes there may be only one user, sometimes there may be 20. So I can't just set up one huge static regex. (I know...)

Answer 1

You can use re.findall:

match_list = []
match = re.findall(r'\n(\w+)\s+(\d*)\s+(\d*)', output)
for k in match:
    # k will be a tuple like this: ('Frank', '114', '0')
    match_list.append(list(k))

Or the same solution as a one-liner:

match_list = list(map(list, re.findall(r'\n(\w+)\s+(\d*)\s+(\d*)', output)))
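
For reference, here is a self-contained sketch of the findall approach run against the sample table from the question. It assumes `output` holds exactly that text (the real script output may differ), and it uses a list comprehension in place of the explicit loop:

```python
import re

# Assumption: `output` holds the script output shown in the question.
output = """\
-------------------------------------------------------------------
Frank         114      0         0         0          0         114
Joe           49       1         0         0          0         50
Bob           37       0         0         0          0         37
Sally         34       2         0         0          0         36"""

# re.search stops at the first match; re.findall returns one tuple of
# captured groups for every non-overlapping match in the string.
rows = re.findall(r'\n(\w+)\s+(\d*)\s+(\d*)', output)

# Convert each tuple to a list to get the shape asked for in the question.
match_list = [list(row) for row in rows]
print(match_list)
```

Because findall collects every match, the same pattern works whether the table has one user or twenty.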

Answer 2

You don't need a regex:

table="""\
-------------------------------------------------------------------
Frank         114      0         0         0          0         114       
Joe           49       1         0         0          0         50        
Bob           37       0         0         0          0         37        
Sally         34       2         0         0          0         36"""

print([line.split() for line in table.splitlines()[1:]])

Or, if you do want a regex:

import re
print([list(t) for t in re.findall(r'^(\w+)' + r'\s+(\d+)' * 6, table, re.MULTILINE)])

Either way, this prints:

[['Frank', '114', '0', '0', '0', '0', '114'], 
 ['Joe', '49', '1', '0', '0', '0', '50'], 
 ['Bob', '37', '0', '0', '0', '0', '37'], 
 ['Sally', '34', '2', '0', '0', '0', '36']]
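
The question also mentions a list of dictionaries. A minimal sketch of that variant, built on the same split() idea, might look like the following. The helper name `parse_rows` and the dictionary keys 'name' and 'values' are made up for illustration, since the question does not show the column headers; a data row is assumed to be a name followed by one or more all-digit fields.

```python
def parse_rows(text):
    """Turn each data line into {'name': str, 'values': [int, ...]}."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the dashed separator: a data row is assumed
        # to be a name followed by at least one all-digit field.
        if len(fields) < 2 or not all(f.isdigit() for f in fields[1:]):
            continue
        records.append({'name': fields[0],
                        'values': [int(f) for f in fields[1:]]})
    return records


table = """\
-------------------------------------------------------------------
Frank         114      0         0         0          0         114
Joe           49       1         0         0          0         50
Bob           37       0         0         0          0         37
Sally         34       2         0         0          0         36"""

records = parse_rows(table)
print(records[0])  # {'name': 'Frank', 'values': [114, 0, 0, 0, 0, 114]}
```

Filtering on the shape of each line, instead of slicing off the first line, means the number of users and the position of the separator do not matter.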

Python - Using regex to get user data
