I have a pandas DataFrame like this:
snapDate instance waitEvent AvgWaitInMs
0 2015-Jul-03 XX gc cr block 3-way 1
1 2015-Jun-29 YY gc current block 3-way 2
2 2015-Jul-03 YY gc current block 3-way 1
3 2015-Jun-29 XX gc current block 3-way 2
4 2015-Jul-01 XX gc current block 3-way 2
5 2015-Jul-01 YY gc current block 3-way 2
6 2015-Jul-03 XX gc current block 3-way 2
7 2015-Jul-03 YY log file sync 9
8 2015-Jun-29 XX log file sync 8
9 2015-Jul-03 XX log file sync 8
10 2015-Jul-01 XX log file sync 8
11 2015-Jul-01 YY log file sync 9
12 2015-Jun-29 YY log file sync 8
I need to convert it to:
snapDate instance gc cr block 3-way gc current block 3-way log file sync
2015-Jul-03 XX 1 Na 8
2015-Jun-29 YY Na 2 8
2015-Jul-03 YY Na 1 9
...
I tried pivot, but dfWaits.pivot(index='snapDate', columns='waitEvent', values='AvgWaitInMs') returns the error "Index contains duplicate entries, cannot reshape".
The result should be another DataFrame.
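Answer 0 (score: 1)
Here is one way to reshape the DataFrame into something similar to what you want. If you have any other specific requirements for the resulting DataFrame, please let me know.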
import pandas as pd
# your data
# ====================================
print(df)
snapDate instance waitEvent AvgWaitInMs
0
0 2015-Jul-03 XX gc cr block 3-way 1
1 2015-Jun-29 YY gc current block 3-way 2
2 2015-Jul-03 YY gc current block 3-way 1
3 2015-Jun-29 XX gc current block 3-way 2
4 2015-Jul-01 XX gc current block 3-way 2
5 2015-Jul-01 YY gc current block 3-way 2
6 2015-Jul-03 XX gc current block 3-way 2
7 2015-Jul-03 YY log file sync 9
8 2015-Jun-29 XX log file sync 8
9 2015-Jul-03 XX log file sync 8
10 2015-Jul-01 XX log file sync 8
11 2015-Jul-01 YY log file sync 9
12 2015-Jun-29 YY log file sync 8
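# processing
# ====================================
# Move the three key columns into the index, then unstack the innermost level
# (waitEvent) into columns; missing combinations become NaN and are filled with 0.
df_temp = df.set_index(['snapDate', 'instance', 'waitEvent']).unstack().fillna(0)
# Drop the outer 'AvgWaitInMs' column level, keeping only the wait event names.
df_temp.columns = df_temp.columns.get_level_values(1).values
# Turn 'instance' back into a regular column; 'snapDate' remains the index.
df_temp = df_temp.reset_index('instance')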
print(df_temp)
instance gc cr block 3-way gc current block 3-way log file sync
snapDate
2015-Jul-01 XX 0 2 8
2015-Jul-01 YY 0 2 9
2015-Jul-03 XX 1 2 8
2015-Jul-03 YY 0 1 9
2015-Jun-29 XX 0 2 8
2015-Jun-29 YY 0 2 8
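As a possible alternative, the sketch below first checks why pivot with index='snapDate' fails (each snapDate/waitEvent pair appears once per instance), then reshapes with pivot_table using both snapDate and instance as the index. It keeps missing combinations as NaN instead of filling them with 0, which may be closer to the "Na" cells in the question. The names rows, dup_pairs and wide are made up for this illustration, and aggfunc="first" is an assumption that is only safe because no (snapDate, instance, waitEvent) combination repeats in the sample data.

```python
import pandas as pd

# Rebuild the sample data from the question.
rows = [
    ("2015-Jul-03", "XX", "gc cr block 3-way", 1),
    ("2015-Jun-29", "YY", "gc current block 3-way", 2),
    ("2015-Jul-03", "YY", "gc current block 3-way", 1),
    ("2015-Jun-29", "XX", "gc current block 3-way", 2),
    ("2015-Jul-01", "XX", "gc current block 3-way", 2),
    ("2015-Jul-01", "YY", "gc current block 3-way", 2),
    ("2015-Jul-03", "XX", "gc current block 3-way", 2),
    ("2015-Jul-03", "YY", "log file sync", 9),
    ("2015-Jun-29", "XX", "log file sync", 8),
    ("2015-Jul-03", "XX", "log file sync", 8),
    ("2015-Jul-01", "XX", "log file sync", 8),
    ("2015-Jul-01", "YY", "log file sync", 9),
    ("2015-Jun-29", "YY", "log file sync", 8),
]
df = pd.DataFrame(rows, columns=["snapDate", "instance", "waitEvent", "AvgWaitInMs"])

# pivot(index='snapDate', columns='waitEvent') needs each (snapDate, waitEvent)
# pair to be unique. Here the pairs repeat, once for each instance.
dup_pairs = int(df.duplicated(subset=["snapDate", "waitEvent"]).sum())
print("duplicated (snapDate, waitEvent) pairs:", dup_pairs)

# Using both snapDate and instance as the index makes every cell unique.
# aggfunc="first" simply picks the single value in each cell (assumption:
# no repeated snapDate/instance/waitEvent combinations).
wide = df.pivot_table(
    index=["snapDate", "instance"],
    columns="waitEvent",
    values="AvgWaitInMs",
    aggfunc="first",
).reset_index()
wide.columns.name = None  # drop the leftover 'waitEvent' label on the columns

print(wide)
```

If zeros are preferred over NaN, as in the answer above, pivot_table also accepts fill_value=0.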