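I'm working on a Python script that exports Nessus data to CSV and removes duplicates. Because of the way the export works, findings on different ports and protocols each get their own row, even when the data in every other column is identical. I need to remove these duplicates, but I want to keep the Port and Protocol column data and append it to the previous row.
Here is a very small CSV that I'm using to test and build the script:
As you can see, all fields are exactly the same except for the Port field, and sometimes the Protocol field differs too. So I need to read both rows of the CSV file and then combine the ports like this: 80,443, and likewise the protocols: tcp,tcp
Then I save only one row, which removes the duplicate. I've tried doing this by checking whether an instance of the Plugin ID already exists, but my output only prints the Port and Protocol of the second row.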
import csv

protocollist = []
portlist = []
pluginid_list = []
multiple = False
with open(csv_file_input, 'rb') as csvfile:
    nessusreader = csv.DictReader(csvfile)
    for row in nessusreader:
        pluginid = row['Plugin ID']
        if pluginid != '':
            pluginid_list.append(row['Plugin ID'])
            print(pluginid_list)
            # pluginid was just appended above, so count is always at least 1 here
            count = pluginid_list.count(pluginid)
            cve = row['CVE']
            if count > 0:
                protocollist.append(row['Protocol'])
                print(protocollist)
                portlist.append(row['Port'])
                print(portlist)
                print('Counted more than 1')
                multiple = True
            if multiple == True:
                stringlist = ', '.join(protocollist)
                newstring1 = stringlist
                protocol = newstring1
                stringlist2 = ', '.join(portlist)
                newstring2 = stringlist2
                port = newstring2
            else:
                protocol = row['Protocol']
                port = row['Port']
            cvss = row['CVSS']
            risk = row['Risk']
            host = row['Host']
            name = row['Name']
            synopsis = row['Synopsis']
            description = row['Description']
            solution = row['Solution']
            seealso = row['See Also']
            pluginoutput = row['Plugin Output']
# Only one row is written, using whatever values the last loop iteration left behind
with open(csv_file_output, 'w') as csvfile:
    fieldnames = ['Plugin ID', 'CVE', 'CVSS', 'Risk', 'Host', 'Protocol', 'Port', 'Name', 'Synopsis', 'Description', 'Solution', 'See Also', 'Plugin Output']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'Plugin ID': pluginid, 'CVE': cve, 'CVSS': cvss, 'Risk': risk, 'Host': host, 'Protocol': protocol, 'Port': port, 'Name': name, 'Synopsis': synopsis, 'Description': description, 'Solution': solution, 'See Also': seealso, 'Plugin Output': pluginoutput})
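There may be some errors in the code, as I've been trying different things, but I wanted to show the code I've been working on to give more context for the problem. This code works when the CSV contains only the data shown, since there are only two items. However, when I introduce a third set of data with a different Plugin ID, that row's values get added to the lists as well, possibly because the if statement is set to > 0.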
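For illustration, here is a minimal sketch of one way the desired merge could be expressed. It is not the script above: it assumes Python 3, and the helper name `merge_duplicate_rows`, the `MERGE_FIELDS` constant, and the sample rows (plugin IDs, host, names) are all made up for the example. Rows count as duplicates when every column other than Protocol and Port matches, so a row with a different Plugin ID starts its own group instead of being appended to the same lists.

```python
import csv
import io
from collections import OrderedDict

# Columns whose values are joined rather than compared (assumption: these two only).
MERGE_FIELDS = ("Protocol", "Port")

# Made-up sample data standing in for a Nessus export; not real scan output.
SAMPLE = """Plugin ID,CVE,Host,Protocol,Port,Name
1001,,10.0.0.5,tcp,80,Example finding A
1001,,10.0.0.5,tcp,443,Example finding A
1002,,10.0.0.5,tcp,22,Example finding B
"""


def merge_duplicate_rows(rows, merge_fields=MERGE_FIELDS, sep=","):
    """Collapse rows that are equal in every column except merge_fields."""
    merged = OrderedDict()  # grouping key -> combined row, in first-seen order
    for row in rows:
        # Two rows belong together when all non-merge columns are identical.
        key = tuple((k, v) for k, v in row.items() if k not in merge_fields)
        if key not in merged:
            combined = dict(row)
            for field in merge_fields:
                combined[field] = [row[field]]  # start a list per merge column
            merged[key] = combined
        else:
            for field in merge_fields:
                merged[key][field].append(row[field])
    # Join the collected values back into single cell strings.
    result = []
    for combined in merged.values():
        for field in merge_fields:
            combined[field] = sep.join(combined[field])
        result.append(combined)
    return result


reader = csv.DictReader(io.StringIO(SAMPLE))
merged_rows = merge_duplicate_rows(reader)

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(merged_rows)
print(out.getvalue())
```

Because the grouping key covers every other column, two rows whose Plugin Output differs would stay separate under this sketch. If only the Plugin ID and Host should decide what counts as a duplicate, the key could be built from those columns instead. Note also that the `csv` writer quotes a merged cell such as `80,443`, since it contains the delimiter.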