test = []
sites = sel.css(".info")
for site in sites:
    money = site.xpath("./h2[@class='money']/text()").extract()
    people = site.xpath("//p[@class='poeple']/text()").extract()
    test.append('{"money":'+str(money[0])+',"people":'+str(people[0])+'}')
My result, test, is:
['{"money":1,"people":23}',
'{"money":3,"people":21}',
'{"money":12,"people":82}',
'{"money":1,"people":54}' ]
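I'm stuck on two things:
First, when I print test, each item is a string, so the result is not real JSON.
Second, the money value 1 appears twice, so I need to add the people values for it together.
The final format I want is: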
[
{"money":1,"people":77},
{"money":3,"people":21},
{"money":12,"people":82}
]
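How can I do this?
Answer 0 (score: 1)
I would collect the money entries in a dict and add up the people as its values. The JSON output really should be produced with the json library (I haven't tested this code, but it should give you an idea of how to approach the problem):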
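import json

money_map = {}
sites = sel.css(".info")
for site in sites:
    # Convert to int so the values are serialized as JSON numbers
    money = int(site.xpath("./h2[@class='money']/text()").extract()[0])
    # ".//p" keeps the search relative to the current site; a bare "//p" searches the whole document
    people = int(site.xpath(".//p[@class='poeple']/text()").extract()[0])
    if money not in money_map:
        money_map[money] = 0
    money_map[money] += people

# Build a list of dicts and serialize it with the json library
output = [{'money': key, 'people': value} for key, value in money_map.items()]
json_output = json.dumps(output)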
Answer 1 (score: 0)
import json

foo = ['{"money":1,"people":23}',
       '{"money":3,"people":21}',
       '{"money":12,"people":82}',
       '{"money":1,"people":54}']
bar = []
for i in foo:
    j = json.loads(i)  # string to json/dict
    # if j['money'] is not in bar:
    bar.append(j)
    # else:
    #     find index of duplicate and add j['people']
The above is an incomplete solution; you still have to implement the "check for duplicates and add" step.
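One possible way to fill in that missing step is sketched below. It is not the answerer's code: the `totals` dict and the `record` name are assumptions introduced for illustration, and the sketch relies on dicts preserving insertion order (Python 3.7+) to keep the original ordering of the money values.

```python
import json

foo = ['{"money":1,"people":23}',
       '{"money":3,"people":21}',
       '{"money":12,"people":82}',
       '{"money":1,"people":54}']

# Hypothetical helper dict: money value -> running sum of people
totals = {}
for item in foo:
    record = json.loads(item)  # parse each JSON string into a dict
    # Start from 0 the first time a money value is seen, then accumulate
    totals[record["money"]] = totals.get(record["money"], 0) + record["people"]

# Rebuild the list of dicts and serialize it as a single JSON document
bar = [{"money": money, "people": people} for money, people in totals.items()]
print(json.dumps(bar))
```

Using a dict keyed by money avoids searching the list for the index of a duplicate on every iteration, which is what the commented-out branch would otherwise need to do.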