NVD - Converting JSON to CSV with Python

Time: 2017-11-15 10:22:32

Tags: python json csv

I am trying to download the NVD CVE feeds. Here is my Python code:

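import os
import re
import requests

# Make sure the target directory exists before writing into it
os.makedirs("nvd", exist_ok=True)

r = requests.get('https://nvd.nist.gov/vuln/data-feeds#JSON_FEED')
# Raw string so the regex escapes are not treated as string escapes
for filename in re.findall(r"nvdcve-1.0-[0-9]*\.json\.zip", r.text):
    print(filename)
    r_file = requests.get("https://static.nvd.nist.gov/feeds/json/cve/1.0/" + filename, stream=True)
    with open("nvd/" + filename, 'wb') as f:
        for chunk in r_file:
            f.write(chunk)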

Now I want to write the contents of all the JSON files into a single CSV file in the following format:

Name, Value, Description, ..., ...  
Name, Value, Description, ..., ...

Can anyone help me?

1 answer:

Answer 0: (score: 1)

The following should get you started, giving you the columns ID, URL, VendorName, Description and VersionValues:

import requests
import re
import zipfile
import io
import json
import csv

r = requests.get('https://nvd.nist.gov/vuln/data-feeds#JSON_FEED')

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    # Header order matches the order of the values written for each entry
    csv_output.writerow(['ID', 'URL', 'VendorName', 'Description', 'VersionValues'])

    for filename in re.findall(r"nvdcve-1.0-[0-9]*\.json\.zip", r.text):
        print("Downloading {}".format(filename))
        r_zip_file = requests.get("https://static.nvd.nist.gov/feeds/json/cve/1.0/" + filename, stream=True)

        # Collect the downloaded zip file in memory instead of on disk
        zip_file_bytes = io.BytesIO()

        for chunk in r_zip_file:
            zip_file_bytes.write(chunk)

        zip_file = zipfile.ZipFile(zip_file_bytes)

        for json_filename in zip_file.namelist():
            print("Extracting {}".format(json_filename))
            json_raw = zip_file.read(json_filename).decode('utf-8')
            json_data = json.loads(json_raw)

            for entry in json_data['CVE_Items']:
                # Lists may be empty, so fall back to a default value
                try:
                    vendor_name = entry['cve']['affects']['vendor']['vendor_data'][0]['vendor_name']
                except IndexError:
                    vendor_name = "unknown"

                try:
                    url = entry['cve']['references']['reference_data'][0]['url']
                except IndexError:
                    url = ''

                try:
                    vv = []

                    for product in entry['cve']['affects']['vendor']['vendor_data'][0]['product']['product_data']:
                        for vd in product['version']['version_data']:
                            vv.append(vd['version_value'])

                    version_values = '/'.join(vv)
                except IndexError:
                    version_values = ''

                csv_output.writerow([
                    entry['cve']['CVE_data_meta']['ID'],
                    url,
                    vendor_name,
                    entry['cve']['description']['description_data'][0]['value'],
                    version_values])

This downloads each zip file into memory. It then extracts the files one at a time into memory and converts the JSON into a Python data structure using json.loads(). For each entry in CVE_Items, it extracts a few fields and writes them to the CSV file.
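To try the in-memory extraction step without touching the network, here is a minimal sketch. It builds a tiny zip archive in memory as a stand-in for a downloaded feed; the sample data, the CVE ID and the helper name read_entries are all made up for illustration and only mimic the field names used above.

```python
import io
import json
import zipfile

# Hypothetical sample that mimics the shape of a feed file; not real NVD data.
sample = {
    "CVE_Items": [
        {
            "cve": {
                "CVE_data_meta": {"ID": "CVE-0000-0001"},
                "description": {
                    "description_data": [{"value": "Example description"}]
                },
            }
        }
    ]
}

# Build a zip archive entirely in memory, standing in for the downloaded bytes.
zip_bytes = io.BytesIO()
with zipfile.ZipFile(zip_bytes, "w") as zf:
    zf.writestr("sample.json", json.dumps(sample))


def read_entries(zip_file_bytes):
    """Yield (ID, description) pairs from every JSON file in an in-memory zip."""
    with zipfile.ZipFile(zip_file_bytes) as zf:
        for name in zf.namelist():
            data = json.loads(zf.read(name).decode("utf-8"))
            for entry in data["CVE_Items"]:
                cve = entry["cve"]
                description = cve["description"]["description_data"][0]["value"]
                yield cve["CVE_data_meta"]["ID"], description


entries = list(read_entries(zip_bytes))
print(entries)
```

Keeping the parsing in a small generator like this makes it easy to test the field extraction separately from the download.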

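As the JSON data is highly structured, you will need to consider how you want to represent all of the fields in a CSV file. Currently it only extracts a few "useful" fields and stores those.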
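One possible way to represent nested fields, shown here only as a sketch, is to flatten each entry into dotted column names and let csv.DictWriter write the result. The flatten helper and the sample entry are hypothetical, and this is just one of several reasonable layouts. Note that empty lists and dicts produce no columns at all with this approach.

```python
import csv
import io


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict keyed by dotted paths."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, "{}{}.".format(prefix, key)))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, "{}{}.".format(prefix, index)))
    else:
        # Scalar value: drop the trailing dot from the accumulated path.
        flat[prefix[:-1]] = value
    return flat


# Hypothetical entry using the same field names as the feed; not real data.
entry = {
    "cve": {
        "CVE_data_meta": {"ID": "CVE-0000-0001"},
        "description": {
            "description_data": [{"lang": "en", "value": "Example"}]
        },
    }
}

row = flatten(entry)

# Write to an in-memory buffer here; a real script would use an open file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

With many entries, the column set would need to be the union of all flattened keys, since entries do not all contain the same fields.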

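Alternatively, instead of making your own CSV, you could use Pandas:

import pandas as pd

# Wrap the raw JSON text in a file-like object for read_json
df = pd.read_json(io.StringIO(json_raw))
df.to_csv(f_output)

Remove the csv_output lines. This would, however, need some extra work to decide how the output should be formatted.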