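I have a Python script that tries to read every .txt file in a directory and determine whether each one returns True or False for any of the conditions in my script. I have thousands of these .txt files. However, I get an error message saying the JSON is invalid. I have checked that my text files are in JSON format. I want the script to determine whether each .txt file matches any of the statements in the code below, and then write the results to a CSV file. Any help is much appreciated! I have included the error message and a sample .txt file.
Sample .txt file in JSON format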
{
    "domain_siblings": [
        "try.wisebuygroup.com.au",
        "www.wisebuygroup.com.au"
    ],
    "resolutions": [
        {
            "ip_address": "34.238.73.135",
            "last_resolved": "2018-04-22 17:59:05"
        },
        {
            "ip_address": "52.0.100.49",
            "last_resolved": "2018-06-24 17:05:06"
        },
        {
            "ip_address": "52.204.226.220",
            "last_resolved": "2018-04-22 17:59:06"
        },
        {
            "ip_address": "52.22.224.230",
            "last_resolved": "2018-06-24 17:05:06"
        }
    ],
    "response_code": 1,
    "verbose_msg": "Domain found in dataset",
    "whois": null
}
Error message
line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Code
import os
import json
import csv

path=r'./output/'
csvpath='C:/Users/xxx/Documents/csvtest'
file_n = 'file.csv'

def vt_result_check(path):
    vt_result = False
    for filename in os.listdir(path):
        with open(path + filename, 'r') as vt_result_file:
            vt_data = json.load(vt_result_file)

        # Look for any positive detected referrer samples
        # Look for any positive detected communicating samples
        # Look for any positive detected downloaded samples
        # Look for any positive detected URLs
        sample_types = ('detected_referrer_samples', 'detected_communicating_samples',
                        'detected_downloaded_samples', 'detected_urls')
        vt_result |= any(sample['positives'] > 0 for sample_type in sample_types
                         for sample in vt_data.get(sample_type, []))

        # Look for a Dr. Web category of known infection source
        vt_result |= vt_data.get('Dr.Web category') == "known infection source"

        # Look for a Forcepoint ThreatSeeker category of elevated exposure
        # Look for a Forcepoint ThreatSeeker category of phishing and other frauds
        # Look for a Forcepoint ThreatSeeker category of suspicious content
        threats = ("elevated exposure", "phishing and other frauds", "suspicious content")
        vt_result |= vt_data.get('Forcepoint ThreatSeeker category') in threats

    return str(vt_result)

if __name__ == '__main__':
    with open(file_n, 'w') as output:
        for i in range(vt_result_file):
            output.write(vt_result_file, vt_result_check(path))
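Answer 0 (score: 1)
You are trying to decode JSON from an empty file (size 0). Check your file paths and the contents of that file.
Note: the example you provided in the question is valid JSON and should load without any problem.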
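With thousands of files, finding the empty ones by hand is impractical. Here is a minimal sketch for listing them, assuming the reports sit directly in one directory; find_empty_files is a hypothetical helper name, not something from the question.

```python
import os


def find_empty_files(path):
    """Return the sorted names of zero-byte regular files directly inside `path`.

    Hypothetical helper for illustration; subdirectories are skipped.
    """
    return sorted(
        name for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
        and os.path.getsize(os.path.join(path, name)) == 0
    )
```

Calling find_empty_files('./output/') would give you the names to inspect or re-download before running the main script.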
Answer 1 (score: 0)
You are not necessarily opening a file here...
for filename in os.listdir(path):
    with open(path + filename, 'r') as vt_result_file:
        vt_data = json.load(vt_result_file)
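os.listdir lists every entry in the path, directories as well as files, so not everything it returns is a JSON text file you can open and parse.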
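One way to skip the non-file entries is to check each one with os.path.isfile before opening it. A rough sketch follows; iter_regular_files is a made-up name for illustration, and os.path.join is used in place of string concatenation so the path works with or without a trailing slash.

```python
import os


def iter_regular_files(path):
    """Yield the full path of each regular file directly inside `path`.

    Hypothetical helper for illustration; subdirectories are skipped.
    """
    for name in sorted(os.listdir(path)):
        full_path = os.path.join(path, name)
        if os.path.isfile(full_path):
            yield full_path
```

The loop in the question could then iterate over iter_regular_files(path) and pass each result straight to open().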
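Answer 2 (score: 0)
I would suggest (1) restricting the script to parse only .txt files, and (2) adding some basic error checking in the form of a try/except statement to catch any JSON errors that do occur. Something like this: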
def vt_result_check(path):
    vt_result = False
    for file in os.listdir(path):
        if not file.endswith(".txt"):  # skip anything that doesn't end in .txt
            continue
        with open(path + file, 'r') as vt_result_file:
            try:
                vt_data = json.load(vt_result_file)
                # do whatever you want with the json data
            except Exception:
                print("Could not parse JSON file " + file)
You can fill in the rest of your code around this.
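For the CSV part of the question, here is one possible way to put the pieces together, offered as a sketch rather than a definitive solution. It assumes you want one row per file, that each report is a JSON object at the top level, and that unparseable files should be marked rather than abort the run. The names check_vt_data and write_results, the column headers, and the 'unreadable' marker are all made up for illustration; the conditions themselves are taken from the question.

```python
import csv
import json
import os

# Conditions copied from the question's code.
SAMPLE_TYPES = ('detected_referrer_samples', 'detected_communicating_samples',
                'detected_downloaded_samples', 'detected_urls')
THREATS = ('elevated exposure', 'phishing and other frauds', 'suspicious content')


def check_vt_data(vt_data):
    """Return True if the parsed report matches any condition from the question."""
    if any(sample.get('positives', 0) > 0
           for sample_type in SAMPLE_TYPES
           for sample in vt_data.get(sample_type, [])):
        return True
    if vt_data.get('Dr.Web category') == 'known infection source':
        return True
    return vt_data.get('Forcepoint ThreatSeeker category') in THREATS


def write_results(path, csv_file):
    """Write one (filename, result) row per .txt file found in `path`."""
    # newline='' is what the csv module documentation recommends for output files.
    with open(csv_file, 'w', newline='') as output:
        writer = csv.writer(output)
        writer.writerow(['filename', 'result'])  # assumed column names
        for name in sorted(os.listdir(path)):
            if not name.endswith('.txt'):
                continue
            try:
                with open(os.path.join(path, name), 'r') as vt_result_file:
                    result = check_vt_data(json.load(vt_result_file))
            except (OSError, ValueError):
                # json.JSONDecodeError is a subclass of ValueError, so empty
                # or malformed files land here instead of stopping the run.
                result = 'unreadable'
            writer.writerow([name, result])
```

Calling write_results('./output/', 'file.csv') would then produce the CSV. Reporting a result per file, instead of OR-ing everything into a single flag, makes it possible to see which file matched and which ones failed to parse.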