Question

I am trying to extract the value of os (Linux 3.11 and newer) from a program's output. I came up with this:

import re

p0f = '''
--- p0f 3.08b by Michal Zalewski <lcamtuf@coredump.cx> ---

[+] Closed 3 file descriptors.
[+] Loaded 324 signatures from '/etc/p0f/p0f.fp'.
[+] Will read pcap data from file 'temp.pcap'.
[+] Default packet filtering configured [+VLAN].
[+] Processing capture data.

.-[ 10.0.7.20/37462 -> 216.58.209.229/443 (syn) ]-
|
| client   = 10.0.7.20/37462
| os       = Linux 3.11 and newer
| dist     = 0
| params   = none
| raw_sig  = 4:64+0:0:1460:mss*20,7:mss,sok,ts,nop,ws:df,id+:0
|
`----

.-[ 10.0.7.20/37462 -> 216.58.209.229/443 (mtu) ]-
|
| client   = 10.0.7.20/37462
| link     = Ethernet or modem
| raw_mtu  = 1500
|
`----


All done. Processed 1 packets.
'''


print p0f
os = re.match(r"os\\s*= (.*)", p0f).group(1)
print os

According to Regex101, my regular expression should be correct, but I get the error NoneType has no 'group'.

Answer 1

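You have two problems:

You are using re.match() where you should be using re.search(). re.match() only matches at the start of the string. See search() vs. match() in the module documentation.
You doubled the backslash on the \s metacharacter (writing \\s) while also using an r'..' raw string literal. In a raw string, \\s matches a literal backslash followed by the letter s, not whitespace.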

This works:

re.search(r"os\s*= (.*)", p0f)

Demo:

>>> import re
>>> re.search(r"os\s*= (.*)", p0f).group(1)
'Linux 3.11 and newer'
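
To see both problems separately, here is a minimal sketch. The `sample` string is a shortened stand-in for the p0f output in the question (an assumption for illustration; the layout of the lines is copied from it), and `os_name` is just an illustrative variable name.

```python
import re

# Shortened stand-in for the p0f output in the question (same line layout assumed).
sample = """
| client   = 10.0.7.20/37462
| os       = Linux 3.11 and newer
| dist     = 0
"""

# re.match only tries the pattern at position 0, so it finds nothing here.
assert re.match(r"os\s*= (.*)", sample) is None

# In a raw string, \\s means a literal backslash followed by "s", not whitespace,
# so even re.search cannot find it in this text.
assert re.search(r"os\\s*= (.*)", sample) is None

# re.search scans the whole string, and \s matches the padding spaces.
found = re.search(r"os\s*= (.*)", sample)
os_name = found.group(1) if found else None
print(os_name)
```

Either mistake alone is enough to make the call return None, which is why calling .group(1) on the result fails.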

Answer 2

If you use the r prefix, do not escape the \. This works:

re.search(r"os\s*= (.*)", p0f).group(1)
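
Because re.search() returns None when nothing matches, it can help to check the result before calling .group(). The following is only a sketch: `extract_os` and `OS_LINE` are hypothetical names, not part of p0f or the re module, and the pattern assumes each field sits on its own line starting with `|`, as in the output shown in the question.

```python
import re

# Hypothetical helper for illustration; anchors the pattern to lines of the
# form "| os = ..." so a stray "os" elsewhere in the text cannot match.
OS_LINE = re.compile(r"^\|[ \t]*os[ \t]*=[ \t]*(.*)$", re.MULTILINE)


def extract_os(text):
    """Return the value of the first '| os = ...' line, or None if absent."""
    found = OS_LINE.search(text)
    return found.group(1).strip() if found else None


report = (
    "| client   = 10.0.7.20/37462\n"
    "| os       = Linux 3.11 and newer\n"
    "| dist     = 0\n"
)
print(extract_os(report))
print(extract_os("no os line here"))
```

With re.MULTILINE, ^ and $ match at the start and end of every line, so the search is not tied to the beginning of the whole string.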
