Question

list = [
    {'status': u'Purchase', 'phantom': False, 'row_no': 1, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 2, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 3, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 4, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 5, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0}  
]

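I have this list of dictionaries. As you can see, there are two rows with product_id 25872 and three rows with product_id 25875.

How can I go through all the dictionaries in the list and build the same kind of list, but with only one row per product? The 'qty' values should be summed.

So from this list, I want to get output like this: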

list = [
    {'status': u'Purchase', 'phantom': False, 'row_no': 1, 'product_id': 25872, 'standard_price': 14.0, 'qty': 2.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 2, 'product_id': 25875, 'standard_price': 14.0, 'qty': 3.0, 'cost': 14.0},
]

Answer 1

I think itertools.groupby, together with Python's sum and a list comprehension, is enough here. Try this:

from itertools import groupby

lst = [
        {'status': u'Purchase', 'phantom': False, 'row_no': 1, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
        {'status': u'Purchase', 'phantom': False, 'row_no': 2, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
        {'status': u'Purchase', 'phantom': False, 'row_no': 3, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
        {'status': u'Purchase', 'phantom': False, 'row_no': 4, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
        {'status': u'Purchase', 'phantom': False, 'row_no': 5, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0}  
]

# Sort the list first, by `product_id`
lst = sorted(lst, key=lambda x:x['product_id'])

# This is where we will store our unique rows
agg = []
row_count = 1

for k, v in groupby(lst, key=lambda x: x['product_id']):
    as_list = list(v)
    # Copy the first row of the group so the input dictionaries are not modified
    first = dict(as_list[0])
    first.update({
        'qty': sum(row['qty'] for row in as_list),
        'row_no': row_count
    })
    agg.append(first)
    row_count += 1

# Print the final result
print(agg)

Note: please don't use list as a variable name; it shadows the built-in list type.
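
A small illustration of why the sort step in the code above matters, offered as a sketch rather than as part of the original answer: itertools.groupby only merges items that sit next to each other, so unsorted input can yield the same key more than once. The ids list below is made up for the demonstration.

```python
from itertools import groupby

# Made-up product ids, deliberately not sorted
ids = [25872, 25875, 25872]

# groupby starts a new group every time the key changes
unsorted_keys = [key for key, _ in groupby(ids)]

# After sorting, equal ids are adjacent and form a single group
sorted_keys = [key for key, _ in groupby(sorted(ids))]

print(unsorted_keys)  # [25872, 25875, 25872]
print(sorted_keys)    # [25872, 25875]
```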

Answer 2

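You can build a dictionary of dictionaries keyed by product_id to make the entries unique, then take .values() of that grouping dictionary. To add up the quantities, go through the merged entries and update each "qty" entry with the sum of the corresponding values from the list. Do the same for the row numbers (if you need them).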

list1 = [
    {'status': u'Purchase', 'phantom': False, 'row_no': 1, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 2, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 3, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 4, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 5, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0}  
]

pid   = "product_id"
merged = {d[pid]:d for d in list1}.values()
merged = [{**m,"qty":sum(ld["qty"] for ld in list1 if ld[pid]==m[pid])} for m in merged]
merged = [{**m,"row_no":i+1} for i,m in enumerate(merged)]

print(merged)

[{'status': 'Purchase', 'phantom': False, 'row_no': 1, 'product_id': 25872, 'standard_price': 14.0, 'qty': 2.0, 'cost': 14.0},
 {'status': 'Purchase', 'phantom': False, 'row_no': 2, 'product_id': 25875, 'standard_price': 14.0, 'qty': 3.0, 'cost': 14.0}]
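
As a possible alternative to re-scanning list1 for every merged entry, the totals can be accumulated in a single pass. This is only a sketch: the helper name merge_rows and its parameters are hypothetical, and it assumes Python 3.7+ so that a plain dict keeps insertion order. Unlike the code above, it keeps the other fields of the first row seen for each product rather than the last.

```python
def merge_rows(rows, key="product_id", total="qty"):
    """Hypothetical helper: one output row per distinct `key`, with `total` summed."""
    merged = {}
    for row in rows:
        k = row[key]
        if k in merged:
            # Product already seen: only add to the running total
            merged[k][total] += row[total]
        else:
            # First time we see this product: copy the row so the input stays untouched
            merged[k] = dict(row)
    # Renumber the surviving rows starting from 1
    return [{**m, "row_no": i} for i, m in enumerate(merged.values(), start=1)]


# Sample data with the same shape as the rows in the question
base = {'status': 'Purchase', 'phantom': False, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0}
product_ids = [25872, 25872, 25875, 25875, 25875]
sample = [{**base, 'row_no': i, 'product_id': p} for i, p in enumerate(product_ids, start=1)]

result = merge_rows(sample)
print(result)
```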

Answer 3

import pandas as pd

rows = [
    {'status': u'Purchase', 'phantom': False, 'row_no': 1, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0,
     'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 2, 'product_id': 25872, 'standard_price': 14.0, 'qty': 1.0,
     'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 3, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0,
     'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 4, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0,
     'cost': 14.0},
    {'status': u'Purchase', 'phantom': False, 'row_no': 5, 'product_id': 25875, 'standard_price': 14.0, 'qty': 1.0,
     'cost': 14.0}
]

df = pd.DataFrame(rows)
print(df)

# Sum 'qty' per product and keep the first value of the other columns.
# Note: 'count' stores the number of merged rows in row_no, not a new sequence number.
print(df.groupby('product_id', as_index=False)
        .agg({'status': 'first', 'phantom': 'first', 'row_no': 'count',
              'standard_price': 'first', 'qty': 'sum', 'cost': 'first'})
        .to_dict(orient='records'))

This still doesn't solve the row_no problem, so I'll keep trying.
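
One possible way to handle the row_no part, sketched here as an assumption rather than as the answerer's eventual solution: aggregate row_no like any other column, then overwrite it with a fresh 1..n sequence after grouping. The variable names (how, out, records) are made up for illustration, and sort=False is used so products stay in order of first appearance.

```python
import pandas as pd

# Sample data with the same shape as the rows in the question
base = {'status': 'Purchase', 'phantom': False, 'standard_price': 14.0, 'qty': 1.0, 'cost': 14.0}
product_ids = [25872, 25872, 25875, 25875, 25875]
rows = [{**base, 'row_no': i, 'product_id': p} for i, p in enumerate(product_ids, start=1)]

df = pd.DataFrame(rows)

# Keep the first value of every column except the grouping key, but sum 'qty'
how = {col: 'first' for col in df.columns if col != 'product_id'}
how['qty'] = 'sum'
out = df.groupby('product_id', as_index=False, sort=False).agg(how)

# Overwrite row_no with a new sequence and restore the original column order
out['row_no'] = list(range(1, len(out) + 1))
out = out[list(df.columns)]

records = out.to_dict(orient='records')
print(records)
```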

Make the same list of dictionaries, but with only unique records
