Pandas: parallel coordinates plot using groupby

Date: 2018-03-28 11:12:40

Tags: python pandas dataframe pandas-groupby

I was wondering if anyone could help me with a parallel coordinates plot.

First, this is the data I am working with. It comes from: https://data.cityofnewyork.us/Transportation/2016-Yellow-Taxi-Trip-Data/k67s-dv2t

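I am trying to normalize some of the features and then use them to compute the mean trip distance, passenger count, and payment amount for each day of the week.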

import pandas as pd
from pandas.plotting import parallel_coordinates

features = ['trip_distance','passenger_count','payment_amount']

# min-max normalize each feature to the range [0, 1]
for feature in features:
    df[feature] = (df[feature]-df[feature].min())/(df[feature].max()-df[feature].min())

# parse the pickup timestamps as datetimes
pickup_time = pd.to_datetime(df['pickup_datetime'], format='%d/%m/%y %H:%M')
# fill dayofweek column with 0-6, where 0 is Monday and 6 is Sunday
df['dayofweek'] = pickup_time.dt.weekday

mean_trip = df.groupby('dayofweek').trip_distance.mean()
mean_passanger = df.groupby('dayofweek').passenger_count.mean()
mean_payment = df.groupby('dayofweek').payment_amount.mean()

#parallel_coordinates('notsurewattoput')

If I print mean_trip, it shows the mean for each day of the week, but I am not sure how to use it to draw a parallel coordinates plot with all three means on the same figure.

Does anyone know how to do this?

1 Answer:

Answer 0: (score: 1)

I think you can change the three mean aggregations so that they output one DataFrame instead of three Series:

mean_trip = df.groupby('dayofweek').trip_distance.mean()
mean_passanger = df.groupby('dayofweek').passenger_count.mean()
mean_payment = df.groupby('dayofweek').payment_amount.mean()

to:

from pandas.plotting import parallel_coordinates

cols = ['trip_distance','passenger_count','payment_amount']
# as_index=False keeps dayofweek as a regular column, which class_column needs
df1 = df.groupby('dayofweek', as_index=False)[cols].mean()
#https://stackoverflow.com/a/45082022
parallel_coordinates(df1, class_column='dayofweek', cols=cols)
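
For anyone who wants to try this without downloading the taxi data, here is a minimal end-to-end sketch of the same idea. The DataFrame is filled with random values (an assumption for illustration only, not the real dataset), the column names follow the question, and the Agg backend and the viridis colormap are my own choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs without a display
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates

# Synthetic stand-in for the taxi data (hypothetical values, not the real dataset).
rng = np.random.default_rng(0)
n = 700  # hourly rows starting on a Monday, enough to cover every weekday
df = pd.DataFrame({
    "pickup_datetime": pd.Timestamp("2016-01-04") + pd.to_timedelta(np.arange(n), unit="h"),
    "trip_distance": rng.uniform(0.5, 20.0, n),
    "passenger_count": rng.integers(1, 7, n).astype(float),
    "payment_amount": rng.uniform(3.0, 80.0, n),
})

cols = ["trip_distance", "passenger_count", "payment_amount"]

# Min-max normalize all three features at once to the range [0, 1].
df[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

# 0 is Monday, 6 is Sunday.
df["dayofweek"] = df["pickup_datetime"].dt.weekday

# One row per weekday, one column per mean; dayofweek stays a regular column.
df1 = df.groupby("dayofweek", as_index=False)[cols].mean()

# One line per weekday, one vertical axis per feature.
ax = parallel_coordinates(df1, class_column="dayofweek", cols=cols, colormap="viridis")
ax.set_ylabel("mean of normalized value")
fig = ax.get_figure()
```

From here, fig.savefig with a file name of your choice writes the plot to disk. Each weekday becomes one line in the plot because parallel_coordinates draws one line per row of df1 and uses the class column only for coloring and the legend.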