Converting grouped records into a DataFrame

Time: 2018-07-25 17:14:34

Tags: python python-3.x pandas

Suppose I have records, grouped by customer, in the following format:

transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
        ...
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
        ...
    ],
    ...
}

I want to generate a DataFrame that looks like this:

CUSTOMER     ITEM     COST
Customer A   Item A   1000
Customer A   Item B   20
...
Customer B   Item C   300
Customer B   Item A   1000
...

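I know that to generate a DataFrame of transactions for a single customer I only need to call pandas.DataFrame(transactions[customer]), but that does not give me the CUSTOMER column. How do I generate a single DataFrame for all transactions? Alternatively, how do I take each customer's DataFrame and glue them together by adding a CUSTOMER column (effectively a reverse groupby)?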

2 answers:

Answer 0: (score: 3)

IIUC

pd.Series(transactions).apply(pd.Series).stack().apply(pd.Series).reset_index(level=0)
Out[431]: 
      level_0  cost    item
0  Customer A  1000  Item A
1  Customer A    20  Item B
0  Customer B   300  Item C
1  Customer B  1000  Item A
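
The second half of the question (build one DataFrame per customer, then glue them together) can also be sketched with pd.concat, which accepts a dict and uses its keys as an outer index level. This is an illustrative alternative, not part of the original answer; the names frames and df are assumptions, and the sample data is the two-customer subset shown in the question.

```python
import pandas as pd

transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
    ],
}

# One DataFrame per customer, keyed by customer name (hypothetical helper dict).
frames = {
    customer: pd.DataFrame(records)
    for customer, records in transactions.items()
}

# Passing a dict to pd.concat turns its keys into the outer index level;
# naming that level lets reset_index move it into a CUSTOMER column.
df = (
    pd.concat(frames, names=['CUSTOMER'])
    .reset_index(level='CUSTOMER')
    .reset_index(drop=True)
)

# Match the column names and order requested in the question.
df = df.rename(columns={'item': 'ITEM', 'cost': 'COST'})
df = df[['CUSTOMER', 'ITEM', 'COST']]
print(df)
```

Compared with the chained apply(pd.Series) calls above, this version builds each customer's frame once, which may be easier to read and is likely cheaper on larger inputs.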

Answer 1: (score: 0)

You can try the following code.

import pandas as pd

transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
    ]
}

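indices = []
arr = []
for customer, records in transactions.items():
    # Repeat the customer name once per record so it lines up with the rows.
    indices.extend([customer] * len(records))
    for record in records:
        # Read the keys explicitly instead of relying on dict ordering.
        arr.append([record['item'], record['cost']])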

df = pd.DataFrame(arr, index=indices, columns=['ITEM', 'COST'])
print(df)

Output:

              ITEM  COST
Customer A  Item A  1000
Customer A  Item B    20
Customer B  Item C   300
Customer B  Item A  1000
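
Note that the output above carries the customer name in the index rather than in a CUSTOMER column as the question asked. A minimal sketch that produces the column directly, by flattening the dict into one row per transaction, might look like the following; the names rows and df are illustrative and not from the original answer.

```python
import pandas as pd

transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
    ],
}

# Flatten the customer -> records mapping into one dict per transaction,
# tagging each record with the customer it belongs to.
rows = [
    {'CUSTOMER': customer, 'ITEM': record['item'], 'COST': record['cost']}
    for customer, records in transactions.items()
    for record in records
]

# Passing columns explicitly fixes the column order.
df = pd.DataFrame(rows, columns=['CUSTOMER', 'ITEM', 'COST'])
print(df)
```

If the index-based result above is already at hand, calling reset_index() on it and renaming the resulting column would be another way to get the same shape.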