Grouping results in an array and counting instances - Python

Time: 2018-02-14 22:00:30

Tags: python arrays dictionary aggregation

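I have a Kafka feed that I am parsing and writing to a database. On the side, I need to group the results in a list of dictionaries and count the instances in each group. Then I need to fold the results of each additional message into the final totals.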

What I have so far:

from collections import Counter

kafakmessage1 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]
kafakmessage2 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]

for d in kafakmessage1:
    freq = str(d['freq'])[:-12]
    power = int((d['power'])+100)
    occur = Counter(freq)
    print(freq, power, occur)

Which gives:

4 -45 Counter({'4': 1})
4 -45 Counter({'4': 1})
5 -46 Counter({'5': 1})
5 -46 Counter({'5': 1})
6 -47 Counter({'6': 1})
7 -47 Counter({'7': 1})
7 -48 Counter({'7': 1})
8 -49 Counter({'8': 1})

What I need:

4 -90 2
5 -92 2
6 -47 1
7 -95 2
8 -49 1

When the outer loop (not shown in the example) consumes the next message (represented by kafakmessage2), the result should be:

4 -180 4
5 -184 4
6 -94 2
7 -190 4
8 -98 2

Thanks for any insight!

1 answer:

Answer 0: (score: 0)

Here is one solution using collections.defaultdict.

from collections import Counter, defaultdict

kafakmessage1 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]
kafakmessage2 = [{'power': -145.08474576271186, 'freq': 4000000000000}, {'power': -145.38135593220343, 'freq': 4601079784043}, {'power': -146.071186440678, 'freq': 5202159568086}, {'power': -146.864406779661, 'freq': 5803239352129}, {'power': -147.73728813559322, 'freq': 6404319136172}, {'power': -147.9474576271186, 'freq': 7005398920215}, {'power': -148.71016949152542, 'freq': 7606478704259}, {'power': -149.52203389830507, 'freq': 8207558488302}]

d_power = defaultdict(int)
d_occur = defaultdict(int)

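for d in kafakmessage1:
    freq = str(d['freq'])[:-12]    # group key: drop the last 12 digits
    power = int(d['power'] + 100)  # shift by 100, then truncate toward zero
    d_power[freq] += power         # running power total for this group
    d_occur[freq] += 1             # each record counts once, whatever the key length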

for f in d_power:
    print(f, d_power[f], d_occur[f])

# 4 -90 2
# 5 -92 2
# 6 -47 1
# 7 -95 2
# 8 -49 1
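
The snippet above handles a single message. For the second part of the question, where each newly consumed message is folded into the running totals, one possible sketch is to keep the two dictionaries outside the consumer loop and update them once per message. The function name `accumulate` and the variables `message1` and `message2` are hypothetical, introduced only for illustration. The sample records are a subset of those in the question, and the grouping rule (drop the last 12 digits of freq, add 100 to power and truncate) is taken from the question's code.

```python
from collections import defaultdict


def accumulate(message, d_power, d_occur):
    """Fold one message (a list of {'power', 'freq'} dicts) into the totals."""
    for d in message:
        freq = str(d['freq'])[:-12]             # group key: drop the last 12 digits
        d_power[freq] += int(d['power'] + 100)  # running power total per group
        d_occur[freq] += 1                      # one occurrence per record


# Totals live outside the loop so they survive from one message to the next.
d_power = defaultdict(int)
d_occur = defaultdict(int)

# A subset of the records from the question, standing in for two Kafka messages.
message1 = [
    {'power': -145.08474576271186, 'freq': 4000000000000},
    {'power': -145.38135593220343, 'freq': 4601079784043},
    {'power': -147.73728813559322, 'freq': 6404319136172},
]
message2 = list(message1)

# Stand-in for the outer consumer loop.
for message in (message1, message2):
    accumulate(message, d_power, d_occur)

for f in d_power:
    print(f, d_power[f], d_occur[f])

# 4 -180 4
# 6 -94 2
```

Passing the dictionaries in as arguments, rather than relying on module-level names, is a design choice that makes it easier to reset or persist the totals separately from the consumer loop.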