Selecting the same value across many dictionaries in Python

Asked: 2016-07-13 03:09:40

Tags: python json

I recently got a question from an interviewer. He gave me a JSON file that looked like this:

   {"id":"110235","symbol":"ccl","qty":"900","available":"35500","time":"2016-05-05T08:00:00.169646Z"}
   {"id":"110235","symbol":"ccl","qty":"550","available":"16000","time":"2016-05-05T08:01:05.167356Z"}
   {"id":"110235","symbol":"ssi","qty":"1550","available":"24000","time":"2016-05-05T08:01:07.173386Z"}
   {"id":"110235","symbol":"tcl","qty":"270","available":"21340","time":"2016-05-05T08:01:15.089586Z"}
   {"id":"110235","symbol":"ccl","qty":"690","available":"57840","time":"2016-05-05T08:01:24.236786Z"}
   {"id":"110235","symbol":"tcl","qty":"740","available":"38540","time":"2016-05-05T08:01:28.145786Z"}

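He wanted me to sum the available values of all records that share the same symbol.

I thought about building a set of the symbols and then looping over the JSON file to sum available for each one, but that is slow.

What is the most efficient way to do this?

3 Answers:

Answer 0: (score: 2)

Here is a simple approach: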

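from collections import defaultdict

# data_set: the parsed records, one dict per JSON object
results = defaultdict(int)
for data in data_set:
    results[data['symbol']] += int(data['available'])

# items() works on both Python 2 and 3 (iteritems() is Python 2 only)
for symbol, total in results.items():
    print('{} - {}'.format(symbol, total))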
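The loop above takes data_set as given. Here is a minimal sketch of how it might be built, assuming the file holds one JSON object per line (as the sample suggests) and that the missing quote after "id" is only a typo in the post. The names raw_lines and load_rows are hypothetical, and the sample rows are inlined so the snippet runs without a file.

```python
import json
from collections import defaultdict

# Sample input, one JSON object per line; values copied from the question.
raw_lines = [
    '{"id":"110235","symbol":"ccl","qty":"900","available":"35500","time":"2016-05-05T08:00:00.169646Z"}',
    '{"id":"110235","symbol":"ccl","qty":"550","available":"16000","time":"2016-05-05T08:01:05.167356Z"}',
    '{"id":"110235","symbol":"ssi","qty":"1550","available":"24000","time":"2016-05-05T08:01:07.173386Z"}',
    '{"id":"110235","symbol":"tcl","qty":"270","available":"21340","time":"2016-05-05T08:01:15.089586Z"}',
    '{"id":"110235","symbol":"ccl","qty":"690","available":"57840","time":"2016-05-05T08:01:24.236786Z"}',
    '{"id":"110235","symbol":"tcl","qty":"740","available":"38540","time":"2016-05-05T08:01:28.145786Z"}',
]


def load_rows(lines):
    """Parse every non-blank line as its own JSON object (hypothetical helper)."""
    return [json.loads(line) for line in lines if line.strip()]


data_set = load_rows(raw_lines)

# Same aggregation as in the answer: one pass, summing 'available' per symbol.
results = defaultdict(int)
for data in data_set:
    results[data['symbol']] += int(data['available'])

for symbol, total in sorted(results.items()):
    print('{} - {}'.format(symbol, total))
```

Since iterating over an open file yields its lines, passing a file object to load_rows instead of the list should work the same way. For the six sample rows the totals come out as 109340 for ccl, 24000 for ssi and 59880 for tcl.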

Answer 1: (score: 0)

It is not clear what output format is expected, but if a dictionary is acceptable, the following runs in O(n) time.

def sum_symbols(data):
    symbols = {}

    for row in data:
        symbol = row["symbol"]
        available = int(row["available"])

        if symbol in symbols:
            symbols[symbol] += available
        else:
            symbols[symbol] = available

    return symbols

sum_symbols(data)
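To see why this is O(n): each row is visited once, and a dict lookup or update takes constant time on average. Building a set of symbols first and rescanning the data for every symbol, as the question describes, costs roughly (number of symbols) × (number of rows) instead. Below is a compact variant of the same single-pass idea, as a sketch only. The function name sum_available_by_symbol is hypothetical, and the rows are trimmed to the two fields that matter.

```python
def sum_available_by_symbol(rows):
    """Return {symbol: total of 'available'} using one pass over rows."""
    totals = {}
    for row in rows:
        symbol = row["symbol"]
        # dict.get supplies 0 the first time a symbol is seen,
        # which replaces the explicit if/else branch.
        totals[symbol] = totals.get(symbol, 0) + int(row["available"])
    return totals


# Sample rows from the question, reduced to the relevant keys.
rows = [
    {"symbol": "ccl", "available": "35500"},
    {"symbol": "ccl", "available": "16000"},
    {"symbol": "ssi", "available": "24000"},
    {"symbol": "tcl", "available": "21340"},
    {"symbol": "ccl", "available": "57840"},
    {"symbol": "tcl", "available": "38540"},
]

totals = sum_available_by_symbol(rows)
print(totals)
```

The int() conversion is needed because the file stores the numbers as strings. Without it, += would concatenate the text rather than add.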

Answer 2: (score: 0)

arr = [{"id": "110235", "symbol": "ccl", "qty": "900", "available": "35500", "time": "2016-05-05T08:00:00.169646Z"},
       {"id": "110235", "symbol": "ccl", "qty": "550", "available": "16000", "time": "2016-05-05T08:01:05.167356Z"},
       {"id": "110235", "symbol": "ssi", "qty": "1550", "available": "24000", "time": "2016-05-05T08:01:07.173386Z"},
       {"id": "110235", "symbol": "tcl", "qty": "270", "available": "21340", "time": "2016-05-05T08:01:15.089586Z"},
       {"id": "110235", "symbol": "ccl", "qty": "690", "available": "57840", "time": "2016-05-05T08:01:24.236786Z"},
       {"id": "110235", "symbol": "tcl", "qty": "740", "available": "38540", "time": "2016-05-05T08:01:28.145786Z"}]

symbols = {}
for a in arr:
    if a["symbol"] in symbols:
        symbols[a["symbol"]] += int(a["available"])
    else:
        symbols[a["symbol"]] = int(a["available"])

print(symbols)