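I have been trying to produce a year-over-year report based on Year_Month and various metrics (for example, trade volume and number of completed trades). The code below builds a small sample; the approach needs to produce properly formatted output for larger datasets as well.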
import pandas as pd
import numpy as np

# (column name, values) pairs; the list keeps the column order explicit
dfTest = [
    ('Client', ['A', 'A', 'A', 'A',
                'B', 'B', 'B', 'B',
                'C', 'C', 'C', 'C',
                'D', 'D', 'D', 'D']),
    ('Year_Month', ['2018-08', '2018-08', '2018-10', '2018-11',
                    '2018-08', '2018-08', '2018-10', '2018-11',
                    '2018-08', '2018-08', '2018-10', '2018-11',
                    '2018-08', '2018-08', '2018-10', '2018-11']),
    ('Volume', [100, 200, 300, 400,
                1, 2, 3, 4,
                10, 20, 30, 40,
                1000, 2000, 3000, 4000]),
    ('state', ['Done', 'Tied Done', 'Tied Done', 'Done',
               'Passed', 'Done', 'Passed', 'Done',
               'Rejected', 'Done', 'Passed', 'Done',
               'Done', 'Done', 'Done', 'Done']),
]

# DataFrame.from_items was removed in pandas 1.0; a dict preserves
# insertion order (Python 3.7+), so the column order is unchanged
df = pd.DataFrame(dict(dfTest))
print(df)
Sample data
   Client Year_Month  Volume      state
0       A    2018-08     100       Done
1       A    2018-08     200  Tied Done
2       A    2018-10     300  Tied Done
3       A    2018-11     400       Done
4       B    2018-08       1     Passed
5       B    2018-08       2       Done
6       B    2018-10       3     Passed
7       B    2018-11       4       Done
8       C    2018-08      10   Rejected
9       C    2018-08      20       Done
10      C    2018-10      30     Passed
11      C    2018-11      40       Done
12      D    2018-08    1000       Done
13      D    2018-08    2000       Done
14      D    2018-10    3000       Done
15      D    2018-11    4000       Done
Answer 0 (score: 0)
Split df into smaller DataFrames, one per client
# groupby yields (client, sub-frame) pairs, so dict() maps each client to its rows
d = dict(tuple(df.groupby('Client')))
print(d)
print("")
# Print each split df
for i in d.values():
    print(i, '\n')
print("")
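To make the shape of d concrete, here is a minimal sketch on a smaller, hypothetical three-row frame (the values are illustrative, not the full sample above). It only shows that the dict keys are the client names and that each value is an ordinary DataFrame.

```python
import pandas as pd

# Hypothetical miniature of the sample data, same column names
df = pd.DataFrame({
    'Client': ['A', 'A', 'B'],
    'Year_Month': ['2018-08', '2018-10', '2018-08'],
    'Volume': [100, 300, 1],
})

# Same split as above: one sub-frame per client
d = dict(tuple(df.groupby('Client')))

clients = sorted(d)                     # client names used as dict keys
total_a = int(d['A']['Volume'].sum())   # each value is a regular DataFrame

print(clients)
print(total_a)
```

Looking a client up by key (d['A']) is handy when only one client's report is needed; for all clients, iterating over df.groupby('Client') directly avoids building the dict.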
Pivot each df by Year_Month and volume
for i in d.values():
    # Sum Volume per Year_Month for this client's sub-frame
    volume = pd.pivot_table(data=i,
                            values='Volume',
                            index=['Client'],
                            columns=['Year_Month'],
                            aggfunc='sum'
                            ).reset_index().fillna(0)
    print(volume, '\n')
print("")
Year_Month Client  2018-08  2018-10  2018-11
0               A      300      300      400

Year_Month Client  2018-08  2018-10  2018-11
0               B        3        3        4

Year_Month Client  2018-08  2018-10  2018-11
0               C       30       30       40

Year_Month Client  2018-08  2018-10  2018-11
0               D     3000     3000     4000
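If one combined table is acceptable instead of one table per client, the loop may not be needed at all. The following is a sketch under that assumption, using a hypothetical five-row frame; fill_value=0 is used in place of the trailing fillna(0), and volume_all is an illustrative name.

```python
import pandas as pd

# Hypothetical miniature of the sample data
df = pd.DataFrame({
    'Client': ['A', 'A', 'A', 'B', 'B'],
    'Year_Month': ['2018-08', '2018-08', '2018-10', '2018-08', '2018-11'],
    'Volume': [100, 200, 300, 1, 4],
})

# One pivot over the whole frame: one row per client, one column per month
volume_all = pd.pivot_table(
    data=df,
    values='Volume',
    index='Client',
    columns='Year_Month',
    aggfunc='sum',
    fill_value=0,   # months with no trades for a client become 0
)
print(volume_all)
```

A single pivot also guarantees that every client row shares the same month columns, which the per-client loop does not when a client has no rows in some month.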
Pivot each df by Year_Month and number of trades
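for i in d.values():
    # np.count_nonzero counts the rows whose Volume is nonzero,
    # regardless of the value in the state column
    count = pd.pivot_table(data=i,
                           values='Volume',
                           index=['Client'],
                           columns=['Year_Month'],
                           aggfunc=np.count_nonzero
                           ).reset_index().fillna(0)
    print(count, '\n')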
Year_Month Client  2018-08  2018-10  2018-11
0               A        2        1        1

Year_Month Client  2018-08  2018-10  2018-11
0               B        2        1        1

Year_Month Client  2018-08  2018-10  2018-11
0               C        2        1        1

Year_Month Client  2018-08  2018-10  2018-11
0               D        2        1        1
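Note that these counts include every row, whatever its state. If "completed trades" is meant to cover only rows whose state is exactly 'Done' (an assumption; the post does not say whether 'Tied Done' should also count), one possible sketch is to filter before pivoting. The frame below is a hypothetical miniature, and done_count is an illustrative name.

```python
import pandas as pd

# Hypothetical miniature of the sample data, including the state column
df = pd.DataFrame({
    'Client': ['A', 'A', 'A', 'B', 'B'],
    'Year_Month': ['2018-08', '2018-08', '2018-11', '2018-08', '2018-08'],
    'Volume': [100, 200, 400, 1, 2],
    'state': ['Done', 'Tied Done', 'Done', 'Passed', 'Done'],
})

# Assumption: only state == 'Done' counts as a completed trade
done = df[df['state'] == 'Done']

# Count the remaining rows per client and month
done_count = pd.pivot_table(
    data=done,
    values='Volume',
    index='Client',
    columns='Year_Month',
    aggfunc='count',
    fill_value=0,   # client/month pairs with no completed trades become 0
)
print(done_count)
```

To treat 'Tied Done' as completed too, the filter could become df['state'].isin(['Done', 'Tied Done']).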