Suppose I have a collection of records in the following format:
transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
        ...
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
        ...
    ],
    ...
}
I want to generate a DataFrame that looks like this:
CUSTOMER ITEM COST
Customer A Item A 1000
Customer A Item B 20
...
Customer B Item C 300
Customer B Item A 1000
...
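I know that to build a transactions DataFrame for a single customer I only need to call pandas.DataFrame(transactions[customer]), but that does not give me the CUSTOMER column. How can I build a single DataFrame for all transactions? Alternatively, how can I take the per-customer DataFrames and glue them together by adding a CUSTOMER column (effectively the reverse of a groupby)?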
Answer 0 (score: 3)
IIUC
pd.Series(transactions).apply(pd.Series).stack().apply(pd.Series).reset_index(level=0)
Out[431]:
level_0 cost item
0 Customer A 1000 Item A
1 Customer A 20 Item B
0 Customer B 300 Item C
1 Customer B 1000 Item A
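The question also asks how to glue per-customer DataFrames together. A minimal sketch of that route, assuming the two-customer sample data from the question: passing a dict of DataFrames to pd.concat puts the dict keys in an outer index level, which can then be moved into a column. The names per_customer and result are illustrative only, and the column renames are there just to match the requested header.

```python
import pandas as pd

# Sample data copied from the question (elided entries left out).
transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
    ],
}

# One DataFrame per customer, keyed by customer name.
per_customer = {name: pd.DataFrame(rows) for name, rows in transactions.items()}

result = (
    pd.concat(per_customer)          # dict keys become the outer index level
      .reset_index(level=0)          # move that level into a column named 'level_0'
      .rename(columns={'level_0': 'CUSTOMER', 'item': 'ITEM', 'cost': 'COST'})
      .reset_index(drop=True)        # replace the repeating 0, 1, 0, 1 index
)
result = result[['CUSTOMER', 'ITEM', 'COST']]
print(result)
```

This avoids the row-wise apply(pd.Series) calls, which tend to be slow on larger inputs.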
Answer 1 (score: 0)
You can try the following code.
import pandas as pd

transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
    ]
}

indices = []
arr = []
for k, v in transactions.items():
    # Repeat the customer name once per transaction so it can label each row.
    indices.extend([k] * len(v))
    for d in v:
        # Read the fields by key so the column order does not depend on dict ordering.
        arr.append([d['item'], d['cost']])

# The customer name ends up in the index, not in a CUSTOMER column.
df = pd.DataFrame(arr, index=indices, columns=['ITEM', 'COST'])
print(df)
Output:
ITEM COST
Customer A Item A 1000
Customer A Item B 20
Customer B Item C 300
Customer B Item A 1000
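If CUSTOMER should be a regular column and not the index, one possible variation is to flatten the nested structure into one record per transaction before building the DataFrame. This is only a sketch under the assumption that every transaction dict has both an 'item' and a 'cost' key; the names rows and flat are illustrative.

```python
import pandas as pd

# Sample data copied from the question (elided entries left out).
transactions = {
    'Customer A': [
        {'item': 'Item A', 'cost': 1000},
        {'item': 'Item B', 'cost': 20},
    ],
    'Customer B': [
        {'item': 'Item C', 'cost': 300},
        {'item': 'Item A', 'cost': 1000},
    ],
}

# One flat record per transaction, tagged with the customer it belongs to.
rows = [
    {'CUSTOMER': customer, 'ITEM': t['item'], 'COST': t['cost']}
    for customer, items in transactions.items()
    for t in items
]

# Passing columns= fixes the column order explicitly.
flat = pd.DataFrame(rows, columns=['CUSTOMER', 'ITEM', 'COST'])
print(flat)
```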