I have a Python script that pings a supplied list of IPv4 and IPv6 addresses and generates an output file like the following:
Item Number: [item number]
host = [hostname]
PING [IPv4]
64 bytes from [dst.IP] icmp_seq=1 ttl=57 time=27.3 ms
64 bytes from [dst.IP] icmp_seq=2 ttl=57 time=26.8 ms
64 bytes from [dst.IP] icmp_seq=3 ttl=57 time=21.6 ms
.
.
.
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 21.604/25.248/27.333/2.589 ms
PING [IPv6]
64 bytes from [dst.IP] icmp_seq=1 ttl=61 time=31.3 ms
64 bytes from [dst.IP] icmp_seq=2 ttl=61 time=22.0 ms
64 bytes from [dst.IP] icmp_seq=3 ttl=61 time=22.8 ms
.
.
.
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 22.098/25.480/31.866/4.519 ms
$$$$$
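Obviously, the information in brackets is variable. My goal is to extract the hostname, ttl and rtt min values for each IP version of each item, all of which sit between "Item Number" and "$$$$$", and export them to, for example, one row of a CSV file.
I can't come up with a suitable regular expression to do this. All I've managed so far is to extract all the information for one item number, meaning all the text between the markers mentioned above: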
import re

reader = open('file')
text = reader.read()
match_list = re.findall(r'Item Number:\.(.*?)\${5}', text, re.S)
length = len(match_list)
for x in match_list:
    print x
For each match, this code leaves out "Item Number:" and "$$$$$".
Any help would be greatly appreciated.
Answer 0 (score: 0)
I suggest you first split the file into blocks and then process each block as follows:
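import csv
import re
from itertools import chain

# 'wb' is the Python 2 mode for csv output; on Python 3 use open('output.csv', 'w', newline='')
with open('file') as f_input, open('output.csv', 'wb') as f_output:
    text = f_input.read()
    csv_output = csv.writer(f_output)

    # Each block is the text between "Item Number:" and "$$$$$"
    for block in re.findall(r'Item Number:(.*?)\${5}', text, re.S):
        re_hostname = re.search(r'host = (.*)', block)

        if re_hostname:
            # One (ttl, time) pair per reply line, flattened into a single row
            data = re.findall(r'ttl=(\d+) time=([0-9.]+) ', block)
            csv_output.writerow([re_hostname.group(1)] + list(chain.from_iterable(data)))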
This produces a CSV file like the following:
[hostname],57,27.3,57,26.8,57,21.6,61,31.3,61,22.0,61,22.8
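The row above holds every per-reply ttl and time, whereas the question asks for one ttl and the rtt min per IP version. A possible variation, as a rough sketch only: split each block again at the lines starting with "PING ", then take the first ttl and the rtt min from each section. The names `parse_blocks`, `write_rows` and `SAMPLE` are made up for illustration, the addresses in the sample are placeholders, and the sketch assumes the ttl is the same for all replies within one ping run. It is written for Python 3.

```python
import csv
import re
import sys

# Hypothetical helper patterns; they assume the output format shown in the question.
BLOCK_RE = re.compile(r'Item Number:(.*?)\${5}', re.S)
HOST_RE = re.compile(r'host = (.*)')
TTL_RE = re.compile(r'ttl=(\d+)')
RTT_MIN_RE = re.compile(r'rtt min/avg/max/mdev = ([0-9.]+)/')

# Made-up sample data in the same shape as the question's output file.
SAMPLE = """Item Number: 1
host = example-host
PING 192.0.2.1
64 bytes from 192.0.2.1 icmp_seq=1 ttl=57 time=27.3 ms
64 bytes from 192.0.2.1 icmp_seq=2 ttl=57 time=26.8 ms
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 21.604/25.248/27.333/2.589 ms
PING 2001:db8::1
64 bytes from 2001:db8::1 icmp_seq=1 ttl=61 time=31.3 ms
64 bytes from 2001:db8::1 icmp_seq=2 ttl=61 time=22.0 ms
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 22.098/25.480/31.866/4.519 ms
$$$$$
"""


def parse_blocks(text):
    """Return one row per item: [hostname, ttl, rtt_min, ttl, rtt_min, ...]."""
    rows = []
    for block in BLOCK_RE.findall(text):
        host = HOST_RE.search(block)
        if not host:
            continue  # skip blocks without a "host = ..." line
        row = [host.group(1).strip()]
        # Split at lines starting with "PING "; the first piece is the
        # item header, so it is dropped with [1:].
        for section in re.split(r'^PING ', block, flags=re.M)[1:]:
            ttl = TTL_RE.search(section)          # first ttl in this ping run
            rtt_min = RTT_MIN_RE.search(section)  # first number of the rtt line
            row.append(ttl.group(1) if ttl else '')
            row.append(rtt_min.group(1) if rtt_min else '')
        rows.append(row)
    return rows


def write_rows(rows, f_output):
    """Write the rows as CSV to an already opened file-like object."""
    csv.writer(f_output).writerows(rows)


if __name__ == '__main__':
    write_rows(parse_blocks(SAMPLE), sys.stdout)
```

With the sample data this gives a single row, `example-host,57,21.604,61,22.098`. If a ping run gets no replies, the corresponding fields are left empty instead of shifting the later columns.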