How do I apply unique and mean functions after grouping a DataFrame?

Time: 2019-05-29 11:15:28

Tags: python dataframe

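I am working with GPS trajectories.

I am trying to find the average speed of vehicles that belong to three different classes. I need the mean for each individual vehicle, not just for each class.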

"Vehicle ID","Frame ID","Total Frames","Global Time","Local X","Local Y","Global X","Global Y","V_Len","V_Width","V_Class","V_Vel","V_Acc","Lane_ID","Pre_Veh","Fol_Veh","Spacing","Headway"
3033,9064,633,1118847885300,42.016,377.256,6451360.093,1873080.530,19.5,8.5,2,27.90,4.29,4,3022,0,93.16,3.34
3033,9065,633,1118847885400,42.060,380.052,6451362.114,1873078.608,19.5,8.5,2,28.43,6.63,4,3022,0,93.87,3.30
3033,9066,633,1118847885500,42.122,382.924,6451364.187,1873076.613,19.5,8.5,2,29.07,6.89,4,3022,0,94.49,3.25
3033,9067,633,1118847885600,42.200,385.882,6451366.307,1873074.553,19.5,8.5,2,29.62,4.41,4,3022,0,95.04,3.21
3033,9068,633,1118847885700,42.265,388.885,6451368.490,1873072.453,19.5,8.5,2,29.93,1.57,4,3022,0,95.57,3.19


import numpy as np
import pandas as pd

# Sort by timestamp (sort_values returns a new DataFrame, so assign the result)
df = df.sort_values(by=["Global Time"])

# Convert the GPS millisecond timestamp to US local time
df["US Time"] = pd.to_datetime(df["Global Time"], unit='ms').dt.tz_localize('UTC').dt.tz_convert('America/Los_Angeles')

grouped = df.groupby('V_Class')

# Find the mean and standard deviation of speed for all vehicles in each class
print(grouped['V_Vel'].agg([np.mean, np.std]))

# Print the vehicle ID and class for every row
for index, row in df.iterrows():
    print(row["Vehicle ID"], row["V_Class"])

Actual output

V_Class     mean        std       
1        40.487673  14.647576
2        37.376317  14.940034
3        40.953483  11.214995

Expected output

Vehicle ID V_Class     mean        std  
3033           2           32.4       12.4
125            1           41.3       9.2
...
and likewise for the other vehicles

1 Answer:

Answer 0 (score: 0)

If you want the mean for each vehicle, just group by vehicle as well:

 df.groupby(['Vehicle ID','V_Class'])['V_Vel'].agg([np.mean, np.std])

With your sample data, it should give:

                     mean       std
Vehicle ID V_Class                 
3033       2        28.99  0.834955
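
For anyone who wants to try this end to end, here is a minimal sketch built from the five sample rows in the question. It makes a few assumptions that are not in the original post: only four of the columns are kept, the data is read from an inline string rather than the real file, and the string names "mean" and "std" are used in place of np.mean and np.std (pandas treats them as the same aggregations). The reset_index() call is an optional extra that flattens the result into the layout shown under "Expected output", and drop_duplicates() is one possible way to list the unique vehicle/class pairs without looping over every row.

```python
import io

import pandas as pd

# Sample rows copied from the question (only the columns needed here).
csv_text = """Vehicle ID,Frame ID,V_Class,V_Vel
3033,9064,2,27.90
3033,9065,2,28.43
3033,9066,2,29.07
3033,9067,2,29.62
3033,9068,2,29.93
"""
df = pd.read_csv(io.StringIO(csv_text))

# Group by vehicle and class, then aggregate the speed column.
# "std" is the sample standard deviation (ddof=1), pandas' default.
per_vehicle = (
    df.groupby(["Vehicle ID", "V_Class"])["V_Vel"]
    .agg(["mean", "std"])
    .reset_index()  # turn the group keys back into ordinary columns
)
print(per_vehicle)

# Unique (vehicle, class) pairs, without iterating over every row.
pairs = df[["Vehicle ID", "V_Class"]].drop_duplicates()
print(pairs)
```

With the full dataset, per_vehicle would contain one row per vehicle, and vehicles with a single observation would get NaN for std.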