Question

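As the title explains, I have been trying to generate nested JSON objects. In this case I have for loops that read the data from a dictionary dic. Here is the code: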

f = open("test_json.txt", 'w')
flag = False
temp = ""
start = "{\n\t\"filename\"" + " : \"" +initial_filename+"\",\n\t\"data\"" +" : " +" [\n"
end = "\n\t]" +"\n}"
f.write(start)
for i, (key,value) in enumerate(dic.iteritems()):
    f.write("{\n\t\"keyword\":"+"\""+str(key)+"\""+",\n")
    f.write("\"term_freq\":"+str(len(value))+",\n")
    f.write("\"lists\":[\n\t")
    for item in value:
        f.write("{\n")
        f.write("\t\t\"occurance\" :"+str(item)+"\n")
        #Check last object
        if value.index(item)+1 == len(value):
            f.write("}\n")
            f.write("]\n")
        else:
            f.write("},") # close occurrence object
    # Check last item in dic
    if i == len(dic)-1:
        flag = True
    if(flag):
        f.write("}")
    else:
        f.write("},") #close lists object
        flag = False 

#check for flag
f.write("]") #close lists array 
f.write("}")

The expected output is:

{
"filename": "abc.pdf",
"data": [{
    "keyword": "irritation",
    "term_freq": 5,
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    }]
}, {
    "keyword": "bomber",
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    }],
    "term_freq": 5
}]
}

But currently I get the following output:

{
"filename": "abc.pdf",
"data": [{
    "keyword": "irritation",
    "term_freq": 5,
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    },]                // Here lies the problem "," before array(last element)
}, {
    "keyword": "bomber",
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    },],                  // Here lies the problem "," before array(last element)
    "term_freq": 5
}]
}

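Please help; I have tried to solve this but failed. Please don't mark it as a duplicate, as I have already checked the other answers and they didn't help at all.

Edit 1: The input basically comes from a dictionary dic whose mapping type is <String, List>, for example: "irritation" => [1,3,5,7,8], where irritation is the key and maps to a list of page numbers. This is read in the outer for loop, where key is the keyword and value is the list of pages on which that keyword occurs.

Edit 2: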

import collections

dic = collections.defaultdict(list) # declaring the dictionary variable
dic[key].append(value) # inserting the values - details not relevant here
for key in dic:
    # Here dic[key] is the list stored for each key
    print key,":",dic[key],"\n" # prints the data in the dictionary

Answer 1

@andrea-f's answer looks good to me; here is another solution:

Feel free to pick either one :)

import json

dic = {
        "bomber": [1, 2, 3, 4, 5],
        "irritation": [1, 3, 5, 7, 8]
      }

filename = "abc.pdf"

json_dict = {}
data = []

for k, v in dic.iteritems():
  tmp_dict = {}
  tmp_dict["keyword"] = k
  tmp_dict["term_freq"] = len(v)
  tmp_dict["lists"] = [{"occurrance": i} for i in v]
  data.append(tmp_dict)

json_dict["filename"] = filename
json_dict["data"] = data

with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)

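It's the same idea: I first build one big json_dict that is then saved directly as JSON. I use the with statement to write the file, so it gets closed for me without any explicit exception handling.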
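For readers on Python 3, where dict.iteritems() no longer exists, the same idea can be sketched roughly as follows. This is not the answerer's code: the helper name build_report is hypothetical, items() stands in for iteritems(), and the key is spelled "occurance" as in the question's expected output.

```python
import json


def build_report(filename, dic):
    """Build the nested structure from the question as plain dicts and lists.

    build_report is a hypothetical helper name used only for this sketch.
    """
    return {
        "filename": filename,
        "data": [
            {
                "keyword": keyword,
                "term_freq": len(pages),
                "lists": [{"occurance": page} for page in pages],
            }
            for keyword, pages in dic.items()  # Python 3 spelling of iteritems()
        ],
    }


dic = {"bomber": [1, 2, 3, 4, 5], "irritation": [1, 3, 5, 7, 8]}
report = build_report("abc.pdf", dic)

# json.dumps places every comma and bracket itself, so no trailing comma can appear.
text = json.dumps(report, indent=4)
print(text)
```

Because the serializer emits the separators, the "is this the last element?" bookkeeping from the question disappears entirely.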

Also, if your JSON output needs further tweaking, have a look at the documentation for json.dumps().
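As a small illustration of the kind of tweaking json.dumps() allows, here is a sketch using its indent, sort_keys and separators parameters. The payload dictionary is made up for the example, and the code assumes Python 3.7 or later.

```python
import json

# Made-up sample data, only for demonstrating the formatting options.
payload = {"filename": "abc.pdf", "data": [{"keyword": "bomber", "term_freq": 5}]}

# Compact form: no spaces after the separators, handy for storage or transfer.
compact = json.dumps(payload, separators=(",", ":"))

# Pretty form: 2-space indent with keys sorted alphabetically.
pretty = json.dumps(payload, indent=2, sort_keys=True)

print(compact)
print(pretty)
```

Note that sort_keys=True moves "data" ahead of "filename" in the pretty form, while the compact form keeps the order in which the keys were written.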

Edit

Just for fun, if you don't like the tmp variable, you can do the whole data for loop in one line :)

json_dict["data"] = [{"keyword": k, "term_freq": len(v), "lists": [{"occurrance": i} for i in v]} for k, v in dic.iteritems()]

For the final solution that gives something not entirely readable:

import json

json_dict = {
    "filename": "abc.pdf",
    "data": [{
        "keyword": k,
        "term_freq": len(v),
        "lists": [{"occurrance": i} for i in v]
    } for k, v in dic.iteritems()]
}

with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)

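Edit 2

It seems you don't need the JSON to be saved in the desired layout; you just want to be able to read it.

In fact, you can also use json.dumps() to print your JSON.

with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle)
    print json.dumps(new_json_dict, indent=4, sort_keys=True)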

There is still one problem here: "filename" is printed at the end, because with sorted keys the d of data comes before the f of filename.

To force the order, you have to use an OrderedDict when building the dict. Note that the syntax is ugly (imo) with Python 2.X.

Here is the new complete solution ;)

import json
from collections import OrderedDict

dic = {
    'bomber': [1, 2, 3, 4, 5],
    'irritation': [1, 3, 5, 7, 8]
}

json_dict = OrderedDict([
    ('filename', 'abc.pdf'),
    ('data', [
        OrderedDict([
            ('keyword', k),
            ('term_freq', len(v)),
            ('lists', [{'occurrance': i} for i in v])
        ]) for k, v in dic.iteritems()])
])

with open('abc.json', 'w') as outfile:
    json.dump(json_dict, outfile)

# Now read the ordered json file back
with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle, object_pairs_hook=OrderedDict)
    print json.dumps(new_json_dict, indent=4)

Will output:

{
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": "bomber",
            "term_freq": 5,
            "lists": [
                {
                    "occurrance": 1
                },
                {
                    "occurrance": 2
                },
                {
                    "occurrance": 3
                },
                {
                    "occurrance": 4
                },
                {
                    "occurrance": 5
                }
            ]
        },
        {
            "keyword": "irritation",
            "term_freq": 5,
            "lists": [
                {
                    "occurrance": 1
                },
                {
                    "occurrance": 3
                },
                {
                    "occurrance": 5
                },
                {
                    "occurrance": 7
                },
                {
                    "occurrance": 8
                }
            ]
        }
    ]
}

But note that most of the time it is better to save a regular json file, so that it stays cross-language.
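On Python 3.7 and later, plain dicts preserve insertion order as part of the language, so the OrderedDict workaround is usually unnecessary there as long as sort_keys is left off. A minimal sketch under that assumption (items() replaces iteritems(), and nothing is written to disk):

```python
import json

dic = {"bomber": [1, 2, 3, 4, 5], "irritation": [1, 3, 5, 7, 8]}

# Keys are emitted in the order they are inserted: filename first, then data.
json_dict = {
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": k,
            "term_freq": len(v),
            "lists": [{"occurrance": i} for i in v],
        }
        for k, v in dic.items()
    ],
}

# No sort_keys here, otherwise "data" would be sorted ahead of "filename" again.
text = json.dumps(json_dict, indent=4)

# json.loads also returns plain dicts that keep the order found in the text.
round_trip = json.loads(text)
print(list(round_trip.keys()))  # ['filename', 'data']
```

On older interpreters the OrderedDict version remains the safer choice.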

Answer 2

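Your current code doesn't work because, on the second-to-last item, the loop adds the }, and when the loop runs again it sets the flag to false; but the previous pass had already added a , because it assumed another element would follow.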
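The effect of such a dangling comma can be reproduced in a few lines. This is an illustrative sketch rather than the asker's code: the list pages and the variable names are made up for the example, and it targets Python 3. It contrasts appending "," after every item with joining the items, which only places a comma between them.

```python
import json

# Made-up page list, only for the demonstration.
pages = [1, 1, 1, 1, 2]

# Writing "}," after every item leaves a dangling comma before the closing "]".
broken = "[" + "".join('{"occurance": %d},' % p for p in pages) + "]"

# Joining the pieces puts a comma only between items, never after the last one.
fixed = "[" + ",".join('{"occurance": %d}' % p for p in pages) + "]"

try:
    json.loads(broken)
    broken_is_valid = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    broken_is_valid = False

print(broken_is_valid)  # False
print(json.loads(fixed) == [{"occurance": p} for p in pages])  # True
```

Joining is the usual way to avoid "last element" checks when text has to be assembled by hand, although letting the json module serialize a real data structure is simpler still.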

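If this is your dict:

a = {"bomber":[1,2,3,4,5]}

then you can do:

import json
file_name = "a_file.json"
file_name_input = "abc.pdf"
new_output = {}
new_output["filename"] = file_name_input

new_data = []
i = 0
for key, val in a.iteritems():
   # One entry per keyword; the "lists" array is filled in below
   new_data.append({"keyword":key, "lists":[], "term_freq":len(val)})
   for p in val:
       new_data[i]["lists"].append({"occurrance":p})
   i += 1

new_output['data'] = new_data

Then save the data with:

# Save step assumed here: json.dump writes new_output to file_name
with open(file_name, "w") as outfile:
    json.dump(new_output, outfile, indent=4, sort_keys=True)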

Generating dynamic nested JSON objects and arrays - Python

2 answers: