I have a CSV file that looks like this:
Build,Avg,Min,Max
BuildA,56.190,39.123,60.1039
BuildX,57.11,40.102,60.200
BuildZER,55.1134,35.129404123,60.20121
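I want the mean, minimum, and maximum of each column, with each statistic added as a new row. I exclude the non-numeric column (the Build column) and then compute the statistics. I did it like this: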
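import pandas as pd

df = pd.read_csv('fakedata.csv')
columns = []
builds = []
# Split the column names into numeric (float64) and everything else
for column in df.columns:
    if df[column].dtype == 'float64':
        columns.append(column)
    else:
        builds.append(column)
save = df[builds]   # keep the non-numeric column(s) for later
df = df[columns]    # numeric columns only
print(df)
# Append one row per statistic
df.loc['Min'] = df.min()
df.loc['Average'] = df.mean()
df.loc['Max'] = df.max()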
If I wrote this data to CSV at this point, it would look like this:
,Avg,Min,Max
0,56.19,39.123,60.1039
1,57.11,40.102,60.2
2,55.1134,35.129404123,60.20121
Min,55.1134,35.129404123,60.1039
Average,55.8817,37.3709520615,60.1522525
Max,57.11,40.102,60.20121
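This is close to what I want, but I would like the Build column to be the first column again, with the build names listed above Min, Average, and Max. Basically this: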
Builds,Avg,Min,Max
BuildA,56.19,39.123,60.1039
BuildX,57.11,40.102,60.2
BuildZER,55.1134,35.129404123,60.20121
Min,55.1134,35.129404123,60.1039
Average,55.8817,37.3709520615,60.1522525
Max,57.11,40.102,60.20121
I tried to achieve this with:
df.insert(0, 'builds', save)
with open('fakedata.csv', 'w') as f:
    df.to_csv(f)
But that gives me this CSV:
,builds,Avg,Min,Max
0,Build1,56.19,39.123,60.1039
1,Build2,57.11,40.102,60.2
2,Build3,55.1134,35.129404123,60.20121
Min,,55.1134,35.129404123,60.1039
Average,,55.8817,37.3709520615,60.1522525
Max,,57.11,40.102,60.20121
How can I fix this?
Answer 0 (score: 1)
IIUC:
df_out = (
    pd.concat([df.set_index('Build'),
               df.set_index('Build').agg(['max', 'min', 'mean'])])
    .rename(index={'max': 'Max', 'min': 'Min', 'mean': 'Average'})
    .reset_index()
)
Output:
index Avg Min Max
0 BuildA 56.1900 39.123000 60.10390
1 BuildX 57.1100 40.102000 60.20000
2 BuildZER 55.1134 35.129404 60.20121
3 Max 57.1100 40.102000 60.20121
4 Min 55.1134 35.129404 60.10390
5 Average 56.1378 38.118135 60.16837
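For completeness, here is a minimal end-to-end sketch of the same idea that also produces the CSV layout asked for in the question (a `Builds` header, statistics in Min, Average, Max order). It assumes the sample data from the question, inlined through `io.StringIO` so the snippet runs without a file; the names `csv_text`, `stats`, `result`, and `csv_out` are only illustrative. All three statistics are computed from the original rows before anything is appended. This seems to be why the question's Average for Avg (55.8817) differs from the 56.1378 shown above: in the question, the mean was taken after the Min row had already been added.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
csv_text = """Build,Avg,Min,Max
BuildA,56.190,39.123,60.1039
BuildX,57.11,40.102,60.200
BuildZER,55.1134,35.129404123,60.20121
"""

# Use the Build column as the index so only numeric columns remain.
df = pd.read_csv(io.StringIO(csv_text), index_col='Build')

# Compute every statistic from the original rows only, then relabel the rows.
stats = df.agg(['min', 'mean', 'max']).rename(
    index={'min': 'Min', 'mean': 'Average', 'max': 'Max'})

# Stack the statistics under the data and name the index column 'Builds'.
result = pd.concat([df, stats]).rename_axis('Builds')

# to_csv() without a path returns the CSV text; pass a filename to write a file.
csv_out = result.to_csv()
print(csv_out)
```

Because the build names live in the index, `to_csv` writes them as the first column and no extra numeric index column appears.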