Question

test=[]
sites = sel.css(".info")
for site in sites:
    money = site.xpath("./h2[@class='money']/text()").extract()
    people = site.xpath(".//p[@class='poeple']/text()").extract()
    test.append('{"money":'+str(money[0])+',"people":'+str(people[0])+'}')

My resulting test is:

['{"money":1,"people":23}', 
  '{"money":3,"people":21}', 
  '{"money":12,"people":82}', 
  '{"money":1,"people":54}' ]

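I'm stuck on two things:

First, the items in test are strings, so what I print is not real JSON.

Second, the money value 1 appears twice, so I need to add the people values for it together.
The final format I want is: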

[
{"money":1,"people":77},
{"money":3,"people":21},
{"money":12,"people":82}
]

How can I do this?

Answer 1

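I would collect the money entries in a dict and add up the people as the values. The JSON output really should be produced with the json library (I haven't tested this code, but it should give you an idea of how to approach the problem):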

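import json

money_map = {}
sites = sel.css(".info")
for site in sites:
    # Convert both values to int so they are summed and serialized as numbers
    money = int(site.xpath("./h2[@class='money']/text()").extract()[0])
    # A relative path (.//) keeps the lookup inside the current .info node;
    # a bare // would search the whole document and always return the first <p>
    people = int(site.xpath(".//p[@class='poeple']/text()").extract()[0])
    # Accumulate the people count for each money value
    money_map[money] = money_map.get(money, 0) + people

output = [{'money': key, 'people': value} for key, value in money_map.items()]
json_output = json.dumps(output)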
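
To try the aggregation idea without a Scrapy selector, here is a minimal sketch that runs on hard-coded data. The `rows` list and the `aggregate` helper are hypothetical names introduced only for illustration; `rows` stands in for the text values the XPath calls would return, and it assumes both values are always integer strings.

```python
import json
from collections import defaultdict

# Hypothetical stand-in for the scraped (money, people) text pairs.
rows = [("1", "23"), ("3", "21"), ("12", "82"), ("1", "54")]


def aggregate(rows):
    """Sum the people counts for each distinct money value."""
    totals = defaultdict(int)  # missing keys start at 0
    for money, people in rows:
        totals[int(money)] += int(people)
    # Dicts keep insertion order, so money values appear in first-seen order.
    return [{"money": m, "people": p} for m, p in totals.items()]


output = aggregate(rows)
json_output = json.dumps(output)
print(json_output)
```

Using `defaultdict(int)` removes the need for the explicit "is the key already there" check, and `json.dumps` turns the list of dicts into a single JSON string instead of a list of hand-built strings.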

Answer 2

Basically something like this:

import json
foo = ['{"money":1,"people":23}',
  '{"money":3,"people":21}',
  '{"money":12,"people":82}',
  '{"money":1,"people":54}' ]


bar = []
for i in foo:
    j = json.loads(i) # string to json/dict
    # if j['money'] is not in bar:
    bar.append(j)
    # else:
    # find index of duplicate and add j['people']

The above is an incomplete solution; you still have to implement the "check for duplicates and add" step yourself.
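
One possible way to fill in that missing step is sketched below. It is not necessarily what the answerer had in mind: the `merge_by_money` function and the `index_by_money` lookup dict are hypothetical names added for illustration, and the sketch assumes every string in `foo` is valid JSON with integer `money` and `people` fields.

```python
import json

foo = ['{"money":1,"people":23}',
       '{"money":3,"people":21}',
       '{"money":12,"people":82}',
       '{"money":1,"people":54}']


def merge_by_money(items):
    """Parse each JSON string and merge entries that share a money value."""
    bar = []
    index_by_money = {}  # money value -> position of its entry in bar
    for raw in items:
        entry = json.loads(raw)  # JSON string -> dict
        pos = index_by_money.get(entry["money"])
        if pos is None:
            # First time this money value is seen: keep the entry.
            index_by_money[entry["money"]] = len(bar)
            bar.append(entry)
        else:
            # Duplicate money value: add its people to the existing entry.
            bar[pos]["people"] += entry["people"]
    return bar


bar = merge_by_money(foo)
print(json.dumps(bar))
```

Keeping a separate index dict avoids scanning `bar` for a duplicate on every iteration, which matters only for long lists but costs nothing here.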

Using a for loop to add up entries with the same value and produce JSON output
