Question

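My code checks the contents of each file, decides whether the result is True or False, and writes that result to a .csv file.

I would also like to write the file name on the same row of the CSV.

Error message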

for i in range(vt_result_file):
NameError: name 'vt_result_file' is not defined

Code

import os
import json
import csv

path=r'./output/'
csvpath='C:/Users/xxx/Documents/csvtest'
file_n = 'file.csv'

def vt_result_check(path):
    vt_result = False
    for filename in os.listdir(path):
        with open(path + filename, 'r') as vt_result_file:
            vt_data = json.load(vt_result_file)

        # Look for any positive detected referrer samples
        # Look for any positive detected communicating samples
        # Look for any positive detected downloaded samples
        # Look for any positive detected URLs
        sample_types = ('detected_referrer_samples', 'detected_communicating_samples',
                        'detected_downloaded_samples', 'detected_urls')
        vt_result |= any(sample['positives'] > 0 for sample_type in sample_types
                                                 for sample in vt_data.get(sample_type, []))

        # Look for a Dr. Web category of known infection source
        vt_result |= vt_data.get('Dr.Web category') == "known infection source"

        # Look for a Forcepoint ThreatSeeker category of elevated exposure
        # Look for a Forcepoint ThreatSeeker category of phishing and other frauds
        # Look for a Forcepoint ThreatSeeker category of suspicious content
        threats = ("elevated exposure", "phishing and other frauds", "suspicious content")
        vt_result |= vt_data.get('Forcepoint ThreatSeeker category') in threats

    return str(vt_result)


if __name__ == '__main__':
    with open(file_n, 'w') as output:
        for i in range(vt_result_file):
            output.write(vt_result_file, vt_result_check(path))

Answer 1

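vt_result_file exists only as a local variable inside vt_result_check; the error is telling you that the name does not exist at the bottom of the file.

Also (though it makes no difference here), you reference the variable before calling the function that creates it.

There is nothing to loop over in the main section, and your check function returns only a single value.

So you can only write a single CSV row: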

if __name__ == '__main__':
    with open(file_n, 'w') as output:
        writer = csv.writer(output)
        writer.writerow([file_n, vt_result_check(path)])
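
To illustrate what writer.writerow produces, here is a minimal sketch that writes one row to an in-memory buffer instead of a real file. The helper name row_as_csv and the values 'example.json' and 'True' are made up for the demonstration and are not part of the original code.

```python
import csv
import io

def row_as_csv(name, result):
    """Return the CSV text csv.writer produces for one [name, result] row."""
    buffer = io.StringIO()          # in-memory stand-in for the opened CSV file
    writer = csv.writer(buffer)
    writer.writerow([name, result])
    return buffer.getvalue()

# 'example.json' and 'True' are made-up values for illustration
line = row_as_csv('example.json', 'True')
print(repr(line))
```

When writing to a real file on Python 3, opening it with newline='' is the usual way to avoid blank lines between rows on Windows.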

Edit

Regarding your comment, you want something like this:

with open(file_n, 'w', newline='') as output:  # Open the CSV file
    writer = csv.writer(output)
    for filename in os.listdir(path):  # Loop over all files to check
        full_filename = os.path.join(path, filename)
        with open(full_filename, 'r') as vt_result_file:
            # Load the file
            vt_data = json.load(vt_result_file)
        # check_file is a placeholder for a function holding the per-file checks
        writer.writerow([full_filename, check_file(vt_data)])
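
check_file is not defined anywhere in the thread, so the following is only one possible sketch of it. It assumes the helper receives the already-parsed JSON dictionary (if it were given a path instead, the json.load step would move inside it) and reuses the checks from the question. The two sample dictionaries are invented test data.

```python
def check_file(vt_data):
    """Return True if the parsed report looks malicious (hypothetical helper)."""
    sample_types = ('detected_referrer_samples', 'detected_communicating_samples',
                    'detected_downloaded_samples', 'detected_urls')
    threats = ("elevated exposure", "phishing and other frauds", "suspicious content")

    # Any listed sample with at least one positive detection
    if any(sample['positives'] > 0
           for sample_type in sample_types
           for sample in vt_data.get(sample_type, [])):
        return True
    # Dr.Web category of known infection source
    if vt_data.get('Dr.Web category') == "known infection source":
        return True
    # Forcepoint ThreatSeeker category in the list of threats
    return vt_data.get('Forcepoint ThreatSeeker category') in threats

# Invented sample data, not real VirusTotal output
clean = {'detected_urls': [{'positives': 0}]}
flagged = {'detected_urls': [{'positives': 3}]}
print(check_file(clean), check_file(flagged))
```

Because the function works on one report at a time, each file gets its own result instead of sharing a single flag across the whole directory.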

Answer 2

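The error occurs because you are trying to use the variable vt_result_file, which is not in scope: it does not exist in the part of the script that tries to access it. You are trying to use it here, presumably to loop over a set of files: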

    for i in range(vt_result_file):
        output.write(vt_result_file, vt_result_check(path))

But vt_result_file exists only inside the vt_result_check function, which has not even been called at the point where you try to use the variable in for i in range(vt_result_file).
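
The scoping rule can be seen in a small standalone sketch. The names load_report and report_file are invented for the example; the point is only that a name assigned inside a function is not visible at module level, even after the function has been called.

```python
def load_report():
    # report_file is a local name: it exists only while load_report runs
    report_file = 'report.json'
    return report_file.upper()

load_report()  # calling the function does not make its locals global

try:
    print(report_file)  # no such name at module level
except NameError as exc:
    message = str(exc)
    print(message)
```

To use a value outside the function, return it (or collect it in a structure that is returned) rather than relying on the local name.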

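Because the vt_result_check function already loops over all the files in the directory, you are also duplicating work; there is no need to do the same thing again to get the results.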

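It also looks as though your main function is not working as intended: you loop over the files and assign each file's contents to vt_data, but you appear to carry out the further analysis only on the last set of data: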

    with open(path + filename, 'r') as vt_result_file:
        # this is done for every file
        vt_data = json.load(vt_result_file)

    # check the indentation level of the code
    # it'll only run on the last value of vt_data
    sample_types = ('detected_referrer_samples', 'detected_communicating_samples',
                    'detected_downloaded_samples', 'detected_urls')
    vt_result |= any(sample['positives'] > 0 for sample_type in sample_types
                                             for sample in vt_data.get(sample_type, []))

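I think you probably want to run the analysis code on every file and then save a result for each one. A simple way to do that is with a dictionary (see https://docs.python.org/3/library/stdtypes.html#mapping-types-dict). Here is a suggestion for how to restructure your code: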

def vt_result_check(path):

    # initialise an empty dictionary for your results
    results = {}

    sample_types = ('detected_referrer_samples', 'detected_communicating_samples',
                            'detected_downloaded_samples', 'detected_urls')
    threats = ("elevated exposure", "phishing and other frauds", "suspicious content")

    for filename in os.listdir(path):
        with open(path + filename, 'r') as vt_result_file:
            # set this to false for each file
            vt_result = False
            vt_data = json.load(vt_result_file)

            # do all your analysis here
            vt_result |= any(sample['positives'] > 0 for sample_type in sample_types
                                                     for sample in vt_data.get(sample_type, []))

            # Look for a Dr. Web category of known infection source
            vt_result |= vt_data.get('Dr.Web category') == "known infection source"

            # Look for a Forcepoint ThreatSeeker category of elevated exposure
            # Look for a Forcepoint ThreatSeeker category of phishing and other frauds
            # Look for a Forcepoint ThreatSeeker category of suspicious content
            vt_result |= vt_data.get('Forcepoint ThreatSeeker category') in threats

            # save the file name and the result in your dict
            # use the file name (not the open file object) as the key
            results[filename] = str(vt_result)

    # return the dictionary
    return results

You can then write out the results like this:

if __name__ == '__main__':
    result_dict = vt_result_check(path)
    with open(file_n, 'w') as output:
        for i in result_dict:
            output.write(i + ": " + result_dict[i] + "\n")
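
Since the goal is a CSV with the file name and the result on the same row, here is a self-contained sketch of the whole flow using csv.writer. It is an illustration under assumptions, not the asker's final code: check_report is a simplified stand-in that only looks at detected_urls, write_results is an invented name, and the two reports are made-up data written to a temporary directory.

```python
import csv
import json
import os
import tempfile

def check_report(vt_data):
    """Simplified stand-in for the checks: True if any URL has positives."""
    return any(s['positives'] > 0 for s in vt_data.get('detected_urls', []))

def write_results(report_dir, csv_file):
    """Write one 'file name, result' row per JSON report in report_dir."""
    with open(csv_file, 'w', newline='') as output:
        writer = csv.writer(output)
        for filename in sorted(os.listdir(report_dir)):
            with open(os.path.join(report_dir, filename), 'r') as report:
                vt_data = json.load(report)
            writer.writerow([filename, str(check_report(vt_data))])

# Build two made-up reports in a temporary directory and check them
with tempfile.TemporaryDirectory() as tmp:
    report_dir = os.path.join(tmp, 'output')
    os.mkdir(report_dir)
    samples = {'a.json': {'detected_urls': [{'positives': 2}]},
               'b.json': {'detected_urls': []}}
    for name, data in samples.items():
        with open(os.path.join(report_dir, name), 'w') as f:
            json.dump(data, f)
    csv_file = os.path.join(tmp, 'file.csv')
    write_results(report_dir, csv_file)
    with open(csv_file, newline='') as f:
        rows = list(csv.reader(f))

print(rows)
```

Writing the row inside the loop keeps the file name and its result together, so no separate dictionary is needed unless the results are used elsewhere.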

Trying to write the file name to a CSV
