I am trying to create a list of dictionaries that groups a dataset by a shared value. The data looks like this:
data = [{"CustName":"customer1", "PartNum":"part1"},
{"CustName":"customer2", "PartNum":"part2"},
{"CustName":"customer1", "PartNum":"part3"},
{"CustName":"customer2", "PartNum":"part4"}]
What I want is:
cleanedData = [
{"CustName":"customer1", "parts":[{"PartNum":"part1"}, {"PartNum":"part3"}]},
{"CustName":"customer2", "parts":[{"PartNum":"part2"}, {"PartNum":"part4"}]}]
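The approach I am trying to get working needs several loops, looks ugly, and does not feel very Pythonic. I also suspect it will not perform well. The input is small for now (fewer than 100 elements), but it could grow to thousands of elements, so nested loops seem inefficient.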
data = [{"CustName":"customer1", "PartNum":"part1"},
{"CustName":"customer2", "PartNum":"part2"},
{"CustName":"customer1", "PartNum":"part3"},
{"CustName":"customer2", "PartNum":"part4"}]
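customers = []
cleanedData = []
# First pass: collect each distinct customer name, in order of first appearance.
for d in data:
    if d["CustName"] not in customers:
        customers.append(d["CustName"])
# Second pass: for every customer, scan the whole dataset again for its parts.
for c in customers:
    parts = []
    for d in data:
        if d["CustName"] == c:  # the key is "CustName"; "CustCode" raises KeyError
            parts.append({"PartNum": d["PartNum"]})  # keep only the part number
    cust = {"CustName": c}
    cust.update({"parts": parts})
    cleanedData.append(cust)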
Can anyone suggest a simpler way to do this? Is there a built-in function that helps with this kind of data manipulation?
Answer 0 (score: 2)
You can use collections.defaultdict.
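from collections import defaultdict

# Missing keys start out as an empty list, so no membership check is needed.
d = defaultdict(list)
for item in data:
    d[item['CustName']].append({'PartNum': item['PartNum']})
print(d)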
If you want the result as a list, you can follow up with a list comprehension:
print([{'CustName': key, 'parts': value} for key, value in d.items()])
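Putting the two steps together, here is a minimal self-contained sketch. The function name group_parts_by_customer is only an illustrative choice, not something from the question. It makes a single pass over the data, so it should scale roughly linearly with the number of records, and on Python 3.7+ the customers come out in order of first appearance because dicts preserve insertion order.

```python
from collections import defaultdict


def group_parts_by_customer(records):
    """Group part numbers under each customer name (hypothetical helper)."""
    grouped = defaultdict(list)
    # Single pass: append each part to the list belonging to its customer.
    for record in records:
        grouped[record["CustName"]].append({"PartNum": record["PartNum"]})
    # Convert the mapping into the list-of-dicts shape from the question.
    return [{"CustName": name, "parts": parts} for name, parts in grouped.items()]


data = [{"CustName": "customer1", "PartNum": "part1"},
        {"CustName": "customer2", "PartNum": "part2"},
        {"CustName": "customer1", "PartNum": "part3"},
        {"CustName": "customer2", "PartNum": "part4"}]

cleanedData = group_parts_by_customer(data)
print(cleanedData)
```

With the sample data from the question, this should print the same structure as the desired cleanedData shown there.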