Question

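I have a list of dicts. I don't know how many dicts the list will contain, because the result varies with the data. I have to go through their values to find what they have in common. Once I have found the common values, I have to merge the dicts that share the same value and count how often that value occurs.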

Here is the sample data.

[
  {
    "id": 100,
    "category": null,
    "mid": null
  },
  {
    "id": 100,
    "city": "roma"
  },
  {
    "id": 100,
    "category": null,
    "mid": null
  },
  {
    "id": 100,
    "city": "roma"
  },
  {
    "id": 200,
    "category": "red",
    "mid": null
  },
  {
    "id": 200,
    "region": "toscany"
  },
  {
    "id": 300,
    "category": "blue",
    "mid": "cold",
    "sub": null
  },
  {
    "id": 400,
    "category": "yellow",
    "mid": "warm"
  },
  {
    "id": 400,
    "city": "milano"
  }
]

The expected result should look like this.

[
  {
    "id": 100,
    "category": null,
    "mid": null,
    "city": "roma",
    "count": 2
  },
  {
    "id": 200,
    "category": "red",
    "mid": null,
    "region": "toscany",
    "count": 1
  },
  {
    "id": 300,
    "category": "blue",
    "mid": "cold",
    "sub": null,
    "count": 1
  },
  {
    "id": 400,
    "category": "yellow",
    "mid": "warm",
    "city": "milano",
    "count": 1
  }
]

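I know how to find what two dicts have in common, but not how to do it for many. Maybe I could use items() to find the matching values and ChainMap() to merge them, but so far I have not reached the expected result.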
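For illustration, the two-dict idea mentioned above might look roughly like the sketch below. The names `a`, `b`, `common`, and `merged` are only illustrative, and the set operation on `items()` assumes every value is hashable.

```python
from collections import ChainMap

a = {"id": 100, "category": None, "mid": None}
b = {"id": 100, "city": "roma"}

# Key/value pairs present in both dicts. items() views support set
# operations as long as all values are hashable.
common = dict(a.items() & b.items())

# Merge only if the two dicts agree on "id". On key clashes ChainMap
# prefers the earlier mapping (here: a).
merged = dict(ChainMap(a, b)) if "id" in common else None

print(common)
print(merged)
```

This works for a pair, but it does not say how to pair up an unknown number of dicts, which is the part that is still missing.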

Edit: this is what I did when I had only two dicts.

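from itertools import groupby
from operator import itemgetter

a = {
    "id": 100,
    "category": None,
    "mid": None
}
b = {
    "id": 100,
    "city": "roma"
}
# Assumption: `rows` was not defined in the original post; here it holds a and b.
rows = [a, b]

def grouping_records():
    # groupby only groups adjacent items, so sort by 'id' first
    rows.sort(key=itemgetter('id'))
    for id_, items in groupby(rows, key=itemgetter('id')):
        print(id_)
        for i in items:
            print(' ', i)

if __name__ == "__main__":
    grouping_records()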

Answer 1

groupby is a bit complicated for many of us; try this naive solution:

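# `original` is the list of dicts from the question, with null written as None
mylist = [dict(s) for s in set(frozenset(d.items()) for d in original)]  # remove duplicate dictionaries if needed
ids = [d['id'] for d in mylist]  # keep a list here: a set has no .count()
# "count" is the number of dicts in mylist that carry each id
id_cnt = {id_: {"count": ids.count(id_)} for id_ in set(ids)}
for d in mylist:
    id_cnt[d['id']].update(d)
result = list(id_cnt.values())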
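
If the goal is to reproduce the expected result from the question exactly, one possible sketch is shown below. It rests on an assumption the question does not state outright: that "count" means how many times the most repeated dict for a given id occurs (so id 100, whose two dicts each appear twice, gets 2). The function name `merge_by_id` is hypothetical, and all values must be hashable.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

def merge_by_id(records):
    """Merge dicts that share an 'id' and attach an assumed 'count'."""
    # Count identical dicts; frozenset(d.items()) needs hashable values.
    occurrences = Counter(frozenset(d.items()) for d in records)
    # One copy of each distinct dict, sorted because groupby only
    # groups adjacent items.
    fragments = sorted((dict(items) for items in occurrences),
                       key=itemgetter("id"))
    result = []
    for _, group in groupby(fragments, key=itemgetter("id")):
        group = list(group)
        merged = {}
        for d in group:
            merged.update(d)
        # Assumption: count = occurrences of the most repeated dict.
        merged["count"] = max(occurrences[frozenset(d.items())]
                              for d in group)
        result.append(merged)
    return result

sample = [
    {"id": 100, "category": None, "mid": None},
    {"id": 100, "city": "roma"},
    {"id": 100, "category": None, "mid": None},
    {"id": 100, "city": "roma"},
    {"id": 200, "category": "red", "mid": None},
    {"id": 200, "region": "toscany"},
]
for row in merge_by_id(sample):
    print(row)
```

If "count" should mean something else, for example the total number of dicts per id, only the line that sets `merged["count"]` needs to change.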

Python: finding commonality across multiple dicts
