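I am trying to extract the following data from a text file: srcintf, dstintf, srcaddr, dstaddr, action, schedule, service, logtraffic, and save the values to a CSV file with the correct rows.
The input file looks like this: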
edit 258
set srcintf "Untrust"
set dstintf "Trust"
set srcaddr "all"
set dstaddr "10.2.22.1/32"
set action accept
set schedule "always"
set service "selling_soft_01"
set logtraffic all
next
edit 184
set srcintf "Untrust"
set dstintf "Trust"
set srcaddr "Any"
set dstaddr "10.1.1.1/32"
set schedule "always"
set service "HTTPS"
set logtraffic all
next
edit 124
set srcintf "Untrust"
set dstintf "Trust"
set srcaddr "Any"
set dstaddr "172.16.77.1/32"
set schedule "always"
set service "ping"
set logtraffic all
set nat enable
next
This is my first time programming (as you can see from the code), but perhaps it gives you a better idea of what I am trying to do. See the code below.
import csv
text_file = open("fwpolicy.txt", "r")
lines = text_file.readlines()
mycsv = csv.writer(open('output.csv', 'w'))
mycsv.writerow(['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule', 'service', 'logtraffic', 'nat'])
n = 0
for line in lines:
    n = n + 1
n = 0
for line in lines:
    n = n + 1
    if "set srcintf" in line:
        srcintf = line
    else: srcintf = 'not set'
    if "set dstintf" in line:
        dstintf = line
    else: dstintf = 'not set'
    if "set srcaddr" in line:
        srcaddr = line
    else: srcaddr = 'not set'
    if "set dstaddr" in line:
        dstaddr = line
    else: dstaddr = 'not set'
    if "set action" in line:
        action = line
    else: action = 'not set'
    if "set schedule" in line:
        schedule = line
    else: schedule = 'not set'
    if "set service" in line:
        service = line
    else: service = 'not set'
    if "set logtraffic" in line:
        logtraffic = line
    else: logtraffic = 'not set'
    if "set nat" in line:
        nat = line
    else: nat = 'not set'
    mycsv.writerow([srcintf, dstintf, srcaddr, dstaddr, schedule, service, logtraffic, nat])
Expected result (CSV file):
srcintf,dstintf,srcaddr,dstaddr,schedule,service,logtraffic,nat
"Untrust","Trust","all","10.2.22.1/32","always","selling_soft_01",all,,
Actual result:
Traceback (most recent call last):
File "parse.py", line 45, in <module>
mycsv.writerow([srcintf, dstintf, srcaddr, dstaddr, schedule, service, logtraffic, nat])
NameError: name 'srcintf' is not defined
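Answer 0 (score: 1)
You are trying to write a row to the CSV for every line in the file.
You should only write the row when you see the word next, so check for it before writing; that way each row's values are fully collected first.
Once that works, you will notice that you are setting each value to the entire line rather than to the value that follows the keyword. For example, with the line
set srcintf "Untrust"
your code
if "set srcintf" in line: srcintf = line
else: srcintf = 'not set'
gives srcintf the value set srcintf "Untrust". Try using split on the string to extract the actual value.
...something like this: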
import csv

# newline='' stops the csv module from writing blank lines between rows on Windows
with open("fwpolicy.txt", "r") as text_file, \
        open('output.csv', 'w', newline='') as out_file:
    mycsv = csv.writer(out_file)
    mycsv.writerow(['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule',
                    'service', 'logtraffic', 'nat'])
    for line in text_file:
        if "edit" in line:
            # start of a new record: reset every field to its default
            [srcintf, dstintf, srcaddr, dstaddr, schedule,
             service, logtraffic, nat] = ['not set']*8
        elif 'next' in line:
            # end of the record: write the values collected so far
            mycsv.writerow([srcintf, dstintf, srcaddr, dstaddr, schedule, service, logtraffic, nat])
        # split()[2] is the third word on the line; it keeps any surrounding
        # double quotes, which csv.writer then escapes as """Untrust"""
        elif "set srcintf" in line:
            srcintf = line.split()[2]
        elif "set dstintf" in line:
            dstintf = line.split()[2]
        elif "set srcaddr" in line:
            srcaddr = line.split()[2]
        elif "set dstaddr" in line:
            dstaddr = line.split()[2]
        elif "set action" in line:
            # collected but not written: the header has no action column
            action = line.split()[2]
        elif "set schedule" in line:
            schedule = line.split()[2]
        elif "set service" in line:
            service = line.split()[2]
        elif "set logtraffic" in line:
            logtraffic = line.split()[2]
        elif "set nat" in line:
            nat = line.split()[2]
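The important thing is to fill in all the values for a row and to write it only once you have them. The repetition could be tidied up, but hopefully this helps with the idea of a state machine: look at where you are in the file to decide whether to collect a value, start a new record, or write a row.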
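The state-machine idea, with the repetition folded into a single lookup on the field name, might look like the following minimal sketch. The names `FIELDS`, `DEFAULT`, and `parse_policies` are hypothetical and not part of the original code. The sketch also assumes that every value is a single token without spaces, as in the sample input, and it strips the surrounding double quotes.

```python
import csv
import io

# Hypothetical names for illustration: FIELDS, DEFAULT, parse_policies.
FIELDS = ('srcintf', 'dstintf', 'srcaddr', 'dstaddr',
          'schedule', 'service', 'logtraffic', 'nat')
DEFAULT = 'not set'


def parse_policies(lines):
    """Return one dict per edit/next block, with defaults for missing fields."""
    rows = []
    row = None  # None means "not inside an edit block"
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        if words[0] == 'edit':
            # State: start a new record with every field defaulted.
            row = dict.fromkeys(FIELDS, DEFAULT)
        elif words[0] == 'next' and row is not None:
            # State: the record is complete, so emit it.
            rows.append(row)
            row = None
        elif (words[0] == 'set' and row is not None
              and len(words) >= 3 and words[1] in FIELDS):
            # State: collect a value, dropping the surrounding quotes.
            row[words[1]] = words[2].strip('"')
    return rows


sample = '''edit 184
set srcintf "Untrust"
set dstaddr "10.1.1.1/32"
set logtraffic all
next
'''
rows = parse_policies(sample.splitlines())

# An in-memory buffer stands in for output.csv in this sketch.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Keeping `row` as `None` outside a block means a stray set or next line before the first edit is ignored rather than raising a NameError.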
Answer 1 (score: 1)
How about using DictWriter?
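import csv

with open("fwpolicy.txt", "r") as text_file, open('output.csv', 'w', newline='') as out_file:
    fieldnames = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule',
                  'service', 'logtraffic', 'nat']
    # extrasaction='ignore' drops keys such as 'action' that are not in fieldnames;
    # QUOTE_NONE writes each value exactly as it appears in the input, quotes included
    mycsv = csv.DictWriter(out_file, fieldnames=fieldnames, extrasaction='ignore',
                           quotechar=None, quoting=csv.QUOTE_NONE)
    mycsv.writeheader()
    row = {}
    for line in text_file:
        words = line.strip().split(maxsplit=2)
        if not words:
            continue  # skip blank lines
        if 'set' == words[0] and len(words) == 3:
            # words[1] is the field name, words[2] is everything after it
            row[words[1]] = words[2]
        elif 'next' == words[0]:
            print(row)
            mycsv.writerow(row)  # fields missing from row are written as empty
            row = {}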
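To see what `maxsplit=2` does in the code above, the following small sketch compares it with a plain `split()` on a value that contains spaces. The sample line is made up for illustration and does not come from the original input file.

```python
# Hypothetical config line whose quoted value contains spaces.
line = '    set comments "allow web traffic"\n'

plain = line.split()                      # splits on every run of whitespace
limited = line.strip().split(maxsplit=2)  # at most two splits, so three parts

print(plain)    # ['set', 'comments', '"allow', 'web', 'traffic"']
print(limited)  # ['set', 'comments', '"allow web traffic"']

# With the limited split, the key and the whole value are easy to pick out.
key, value = limited[1], limited[2]
```

With the limited split, the third element is always the complete remainder of the line, so multi-word values are not truncated.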
Answer 2 (score: 0)
Here is how I would approach it:
import csv

with open("structured_content.txt", "r") as text_file:
    lines = text_file.read()
fieldnames = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule', 'service', 'logtraffic', 'nat']
defaults = {'srcintf' : "not set", 'dstintf': "not set", 'srcaddr': "not set",
            'dstaddr': "not set", 'schedule': "not set", 'service': "not set",
            'logtraffic': "not set", 'nat': "not set"}
with open('output.csv', 'w', newline='') as out_file:
    mycsv = csv.DictWriter(out_file, fieldnames)
    mycsv.writeheader()
    # each "next" ends a block; each "set" inside a block starts a key/value pair
    for block in lines.split("next"):
        csv_row = {}
        for p in [(s.strip()) for s in block.replace("\n", "").split("set")]:
            s = p.split()
            if len(s)==2:
                csv_row[s[0]]=s[1] # n.b. this includes "action" and "edit" fields, which need stripping out
        if not csv_row:
            continue # nothing parsed, e.g. the empty text after the last "next"
        csv_write_row = {}
        for k,v in csv_row.items():
            print ( "key=",k,"value=",v )
            if k in fieldnames: # a filter to only include fields in the "fieldnames" list
                print ( k , " is in the list - attach its value to the output dictionary")
                csv_write_row[k]=v
        for k,v in defaults.items():
            if k not in csv_write_row.keys(): # pad-out the output row with any default values not lifted from the file
                print ( k , " is not in the list - write a default out")
                csv_write_row[k]=v
        mycsv.writerow(csv_write_row)
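My aim is to exploit the structure of the file and use split to break the text into repeating blocks. Converting the file to CSV is then just a matter of aligning the blocks (and nested blocks) to the CSV format. csv.DictWriter provides a useful interface for saving your content row by row.
If you want defaults for values that are absent, you can do that with a dictionary that maps each field name to its default (missing) value. The prepared csv_write_row can then be "washed" with these defaults wherever a value was not found in the file.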
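As an alternative to padding the row by hand, `csv.DictWriter` accepts a `restval` argument that is written for any field missing from the row dictionary, and `extrasaction='ignore'` drops keys that are not listed in `fieldnames`. A minimal sketch, using a made-up row, a shortened field list, and an in-memory buffer in place of output.csv:

```python
import csv
import io

# Shortened, illustrative field list; the real one has eight names.
fieldnames = ['srcintf', 'dstintf', 'nat']

# A parsed block that lacks 'nat' and carries an extra 'edit' key.
row = {'srcintf': 'Untrust', 'dstintf': 'Trust', 'edit': '124'}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                        restval='not set',      # default for missing fields
                        extrasaction='ignore')  # silently drop unknown keys
writer.writeheader()
writer.writerow(row)

output = buffer.getvalue()
print(output)
```

This replaces both the filtering loop and the defaults loop, at the cost of using one default value for every column.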
Answer 3 (score: 0)
Here is one way to do it:
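import pandas as pd

keys = ['srcintf', 'dstintf', 'srcaddr', 'dstaddr', 'schedule', 'service', 'logtraffic', 'nat']
# assumes the input file name used in the question
with open("fwpolicy.txt", "r") as text_file:
    lines = text_file.readlines()
records = []
record = dict()
for line in lines:
    # substring match: assumes no key name also appears inside another line's value
    found_key = [key for key in keys if key in line]
    if len(found_key) > 0:
        # drop the quotes, then take the first word after "set <key>"
        value = line.strip().replace('"', '').split(" ")[2:]
        record[found_key[0]] = value[0]
    if 'next' in line:
        records.append(record)
        record = dict()
# fields missing from a record become empty cells in the CSV
pd.DataFrame(records).to_csv('output.csv', index=False)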