Question

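I'm working on a Python script that takes Nessus data exported as CSV and removes duplicate entries. Because of how the export works, findings for different ports and protocols each get their own row, even when the data in every other column is identical. I need to remove these duplicates, but I want to keep the Port and Protocol column data and append it to the previous row.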

Here is a very small CSV that I am using to test and build the script:

Screenshot of CSV File

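As you can see, all fields are exactly the same apart from the Port field, and sometimes the Protocol field differs too. So I need to read both rows of the CSV file and then combine the ports like this: 80, 443, and likewise for the protocol: tcp, tcp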

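Then only one row should be saved, which removes the duplicate. I have tried doing this by checking whether an instance of the Plugin ID already exists, but my output only prints the Port and Protocol of the second row.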

protocollist = []
portlist = []
pluginid_list = []
multiple = False 

with open(csv_file_input, 'rb') as csvfile:
    nessusreader = csv.DictReader(csvfile)
    for row in nessusreader:
        pluginid = row['Plugin ID']
        if pluginid != '':
            pluginid_list.append(row['Plugin ID'])
            print(pluginid_list)
        count = pluginid_list.count(pluginid)
        cve = row['CVE']
        if count > 0:
            protocollist.append(row['Protocol'])
            print(protocollist)
            portlist.append(row['Port'])
            print(portlist)
            print('Counted more than 1')
            multiple = True
        if multiple == True:
            stringlist = ', '.join(protocollist)
            newstring1 = stringlist
            protocol = newstring1
            stringlist2 = ', '.join(portlist)
            newstring2 = stringlist2
            port = newstring2
        else:
            protocol = row['Protocol']
            port = row['Port']
        cvss = row['CVSS']
        risk = row['Risk']
        host = row['Host']
        name = row['Name']
        synopsis = row['Synopsis']
        description = row['Description']
        solution = row['Solution']
        seealso = row['See Also']
        pluginoutput = row['Plugin Output']

with open(csv_file_output, 'w') as csvfile:
    fieldnames = ['Plugin ID', 'CVE', 'CVSS', 'Risk', 'Host', 'Protocol', 'Port', 'Name', 'Synopsis', 'Description', 'Solution', 'See Also', 'Plugin Output']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    writer.writerow({'Plugin ID': pluginid, 'CVE': cve, 'CVSS': cvss, 'Risk': risk, 'Host': host, 'Protocol': protocol, 'Port': port, 'Name': name, 'Synopsis': synopsis, 'Description': description, 'Solution': solution, 'See Also': seealso, 'Plugin Output': pluginoutput})

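There may be some errors in the code, since I have been trying different things, but I wanted to show the code I have been working on to give more context for the problem. The code works if the CSV contains only the data shown above, since there are just two items. However, when I introduce a third set of data with a different Plugin ID, it gets added to the lists as well, probably because the if statement is set to > 0.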
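For reference, here is a minimal sketch of the behaviour I am after, not a tested solution for real Nessus exports. It assumes Python 3 (unlike the 'rb' mode above), well-formed rows with no missing or extra fields, and that two rows count as duplicates when every column other than Protocol and Port matches. The names merge_rows, dedupe_csv and MERGE_COLUMNS are hypothetical helpers made up for this illustration.

```python
import csv

# Columns whose values are collected rather than compared
# (names taken from the CSV header used in the question).
MERGE_COLUMNS = ('Protocol', 'Port')


def merge_rows(rows):
    """Collapse rows that differ only in Protocol/Port into one row each."""
    merged = {}  # key -> merged row; dicts keep insertion order in Python 3.7+
    for row in rows:
        # Every column except Protocol and Port identifies a finding.
        key = tuple((k, v) for k, v in row.items() if k not in MERGE_COLUMNS)
        if key not in merged:
            # First time this finding is seen: keep a copy of the row.
            merged[key] = dict(row)
        else:
            # Duplicate finding: append its protocol and port to the kept row.
            for col in MERGE_COLUMNS:
                merged[key][col] += ', ' + row[col]
    return list(merged.values())


def dedupe_csv(input_path, output_path):
    """Read a CSV export, merge duplicate findings, and write the result."""
    with open(input_path, newline='') as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        rows = merge_rows(reader)
    with open(output_path, 'w', newline='') as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

Keying on the whole row rather than on a running count of Plugin IDs means a third finding with a different Plugin ID starts its own entry instead of being appended to the shared lists. It would be called as dedupe_csv(csv_file_input, csv_file_output).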

Using Python

0 answers: