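I am trying to extract and parse words (the hostname and the version) from a text file. When I run my code, it writes data to a CSV file, but the output does not look the way I expect.
**My input file is a .txt file; its content is below:**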
Hostname Router1
version 15.01
line console 0
logging synchronous
exec-timeout 15 1
usb-inactivity-timeout 15
exec prompt timestamp
transport preferred none
Hostname Router2
version 15.02
line vty 0 15
logging synchronous
exec-timeout 15 2
exec prompt timestamp
transport input ssh
transport preferred none
access-class REMOTE_ACCESS in
Hostname Router3
version 15
line console 0
logging synchronous
exec-timeout 15 3
usb-inactivity-timeout 15
exec prompt timestamp
transport preferred none
Hostname Router3
version 15.12
line vty 0 15
logging synchronous
exec-timeout 15 4
exec prompt timestamp
transport input ssh
transport preferred none
access-class REMOTE_ACCESS in
**Above is the sample content in my input text file**
import re
import csv
with open('sample5.csv','w',newline='') as output:
    HeaderFields = ['Hostname','version']
    writer = csv.DictWriter(output,fieldnames=HeaderFields)
    writer.writeheader()
with open('testfile.txt','r',encoding='utf-8') as input:
    for line in input.readlines():
        pattern = re.compile(r'Hostname(.*)''|''version(.*)')
        match=pattern.finditer(line)
        for match1 in match:
            with open('sample5.csv', 'a',newline='') as output:
                writer = csv.DictWriter(output, fieldnames=HeaderFields)
                writer.writerow({'Hostname': match1.group(1), 'version':
                                 match1.group(2)})
My expected result in the CSV is as follows:
Thanks.
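Answer 0 (score: 2)
Your code fails because each iteration reads only one line, which can contain the hostname or the version but never both, yet you write a CSV row for every match, so each row has one empty column. Instead, read the whole text and match two-line blocks: a first line `Hostname ...` followed by a second line `version ...`. The `(\r\n?|\n)` group covers the line-break variants: `\r\n` on Windows, `\n` on Unix-like systems, and a bare `\r` on classic Mac OS. Because each match spans both lines, you can take the hostname and the version from the same match object: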
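# Replaces the second "with" block of the question; re, csv, and HeaderFields
# are already defined there, and the header row is already written.
with open('testfile.txt', 'r', encoding='utf-8') as input:
    txt = input.read()  # read everything so one match can span two lines

# group(1) = hostname, group(2) = line break, group(3) = version
pattern = re.compile(r'Hostname (.*)(\r\n?|\n)version (.*)')

# Open the CSV once in append mode instead of once per match
with open('sample5.csv', 'a', newline='') as output:
    writer = csv.DictWriter(output, fieldnames=HeaderFields)
    for match1 in pattern.finditer(txt):
        writer.writerow({'Hostname': match1.group(1),
                         'version': match1.group(3)})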
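
For anyone who wants to try this without the input file, here is a rough, self-contained sketch of the same idea. It is only an illustration under a few assumptions: the names `extract_pairs`, `write_pairs`, `SAMPLE`, `PAIR_PATTERN`, and `PER_LINE_PATTERN` are made up for this example, the line-break group is written as non-capturing (so the version is group 2 rather than group 3), and `strip()` is added to drop a stray `\r` or surrounding spaces. It also shows why the per-line pattern from the question always leaves one group empty.

```python
import csv
import io
import re

# Two-line pattern in the spirit of the answer: one match covers the
# "Hostname ..." line and the "version ..." line that follows it.
# (?:...) is non-capturing, so the version ends up in group 2 here.
PAIR_PATTERN = re.compile(r'Hostname (.*)(?:\r\n?|\n)version (.*)')

# The question's per-line pattern, kept only to show the half-empty rows.
PER_LINE_PATTERN = re.compile(r'Hostname(.*)|version(.*)')

# Hypothetical shortened sample, taken from the question's input.
SAMPLE = (
    "Hostname Router1\n"
    "version 15.01\n"
    "line console 0\n"
    "logging synchronous\n"
    "Hostname Router2\n"
    "version 15.02\n"
    "line vty 0 15\n"
)


def extract_pairs(text):
    """Return a list of (hostname, version) tuples found in text."""
    return [(m.group(1).strip(), m.group(2).strip())
            for m in PAIR_PATTERN.finditer(text)]


def write_pairs(pairs, output):
    """Write a header row and one row per pair to an open text file object."""
    writer = csv.writer(output)
    writer.writerow(['Hostname', 'version'])
    writer.writerows(pairs)


if __name__ == '__main__':
    # Only one side of the alternation can match on a single line,
    # so the other group is None and its CSV column comes out empty.
    print(PER_LINE_PATTERN.search("Hostname Router1\n").groups())
    # (' Router1', None)

    pairs = extract_pairs(SAMPLE)
    print(pairs)
    # [('Router1', '15.01'), ('Router2', '15.02')]

    buffer = io.StringIO()  # stands in for open('sample5.csv', 'w', newline='')
    write_pairs(pairs, buffer)
    print(buffer.getvalue())
```

To write a real file, pass `open('sample5.csv', 'w', newline='')` to `write_pairs` instead of the `StringIO` buffer; opening the output once also avoids reopening the CSV for every match.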