Python: merge (concatenate) data from multiple CSV rows into another CSV

Asked: 2015-07-21 14:49:01

Tags: python sorting csv ip concatenation

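I run some nmap scan reports every day and am trying to fully automate them with Python. I have a CSV with an IP address and a port number, one pair per line. I am trying to merge the port numbers into a single list per address. Here is a sample of the input CSV: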

    address       port
    192.168.3.5   80
    192.168.3.5   443
    192.168.3.5   3389
    192.168.3.5   137
    192.168.4.77  80
    192.168.4.77  445

The output should look like this:

    address         ports
    192.168.3.5     80, 443, 3389, 137
    192.168.4.77    80, 445

Here is the complete script:

import subprocess

# Function to run peepingtom
def run_peepingtom(dir):

    scanfile = dir + '/nmap-scan.xml'

    subprocess.call(["python", "peepingtom/peepingtom.py", "-x", scanfile, "-o", dir + "/peepcaptures/"])


# Function to run NMAP on a list of IPs. The scan results will be in "dir" location
def run_nmap(dir):

    targets = dir + '/targets.txt'

    subprocess.call(["nmap", "-vv", "-A", "-sV", "-Pn", "-T4", "-iL", targets, "-oA", dir + "/nmap-scan"])

    # Create an HTML report
    subprocess.call(["xsltproc", dir + "/nmap-scan.xml", "-o", dir + "/nmap-scan.html"])


# Function to convert NMAP output to CSV
def run_nmap_parser(dir):

    scanfile = dir + '/nmap-scan.xml'

    subprocess.call(["python", "nmap-parser-xml-to-csv/nmap-parser-xml-to-csv.py", scanfile, "-s", ",", "-o", dir + "/nmap-scan.csv"])


def main():

    outputdir = '2015-07-20'

    run_nmap(outputdir)

    run_peepingtom(outputdir)

    run_nmap_parser(outputdir)


if __name__ == '__main__':
    main()

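I wrote a Python script that runs the scans, creates the CSV output, and so on, using a few open source tools to get what I need. After that I still have to do some manual formatting, and that is the part I want to automate. My Python skills are very limited, so any help is appreciated. Where should I start?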

1 answer:

Answer 0 (score: 1)

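The following script can process your input CSV file. It reads each row of the CSV report log and adds the port to a dictionary entry keyed by IP address. Each dictionary entry holds a set of the ports seen for that IP address, so duplicates are dropped. The output is sorted by IP address.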

import csv, collections, socket

# Map each IP address to the set of ports seen for it
d_ip = collections.defaultdict(set)

# newline="" lets the csv module handle line endings itself (Python 3)
with open("report_log.csv", "r", newline="") as f_input:
    csv_input = csv.reader(f_input, skipinitialspace=True)
    headers = next(csv_input)

    for row in csv_input:
        d_ip[row[0]].add(row[1])
        #d_ip[row[0]].append(row[1])   # if a list is preferred

with open("port_usage.csv", "w", newline="") as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(headers)
    print("%-20s %s" % (headers[0], headers[1]))

    # Sort by IP address (numerically, via the packed 4-byte form)
    ip_sorted = sorted(d_ip, key=socket.inet_aton)

    for ip in ip_sorted:
        l_ports = list(d_ip[ip])
        l_ports.sort(key=int)   # sort ports as numbers, not as strings
        csv_output.writerow([ip, ", ".join(l_ports)])
        print("%-20s %s" % (ip, ", ".join(l_ports)))

This prints the following output:

address              port
192.168.3.5          80, 137, 443, 3389
192.168.4.77         80, 445
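The addresses are ordered by numeric value, not as text, which is why the script sorts with socket.inet_aton as the key. The short Python 3 sketch below shows the difference. The sample addresses are made up for illustration, and the ipaddress variant is an alternative from the standard library, not something the script above uses.

```python
import ipaddress
import socket

# Hypothetical sample addresses, chosen so that text order and numeric order differ
ips = ["192.168.10.1", "192.168.3.5", "192.168.4.77"]

# Plain string sorting compares character by character, so "10" lands before "3"
by_text = sorted(ips)

# socket.inet_aton packs an IPv4 address into 4 bytes, which compare numerically
by_packed = sorted(ips, key=socket.inet_aton)

# ipaddress.ip_address returns address objects that also compare numerically
by_object = sorted(ips, key=ipaddress.ip_address)

print(by_text)
print(by_packed)
print(by_object)
```

Both key functions give the same order for IPv4 input. socket.inet_aton only handles IPv4, so ipaddress may be the safer choice if IPv6 addresses could ever appear in the scan results.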

If you need all ports (not just the unique ones), simply change to defaultdict(list), change .add() to .append() and comment out the sort.
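As a rough sketch of that list-based variant, the function below keeps every port in the order it appears in the input, which matches the ordering shown in the question's desired output. It is written for Python 3.7+ (where dictionaries keep insertion order). The name merge_ports and its two path parameters are made up for this example, and it assumes the address is in the first column and the port in the second.

```python
import collections
import csv


def merge_ports(input_path, output_path):
    """Group the port column by address, keeping every port in input order."""
    # Hypothetical helper: a list (not a set) keeps duplicates and input order
    ports_by_ip = collections.defaultdict(list)

    with open(input_path, newline="") as f_input:
        reader = csv.reader(f_input, skipinitialspace=True)
        headers = next(reader)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or incomplete rows
            ports_by_ip[row[0]].append(row[1])

    with open(output_path, "w", newline="") as f_output:
        writer = csv.writer(f_output)
        writer.writerow(headers)
        # Addresses are written in the order they were first seen (no sorting)
        for ip, ports in ports_by_ip.items():
            writer.writerow([ip, ", ".join(ports)])

    return dict(ports_by_ip)
```

Wrapped like this, the step could be called from the question's main() after run_nmap_parser(outputdir), for example merge_ports(outputdir + "/nmap-scan.csv", outputdir + "/port_usage.csv"). That assumes the parser's CSV really has the address and port in its first two columns, so the column indexes may need adjusting.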