Pandas: what does groupby('date_x')['outcome'].mean() mean?

Time: 2017-09-25 04:50:57

Tags: pandas pandas-groupby

https://www.kaggle.com/anokas/time-travel-eda

What exactly does this code mean? I can't find groupby('date_x')['outcome'].mean() in the sklearn docs.

date_x['Class probability'] = df_train.groupby('date_x')['outcome'].mean()
date_x['Frequency'] = df_train.groupby('date_x')['outcome'].size()
date_x.plot( secondary_y='Frequency',figsize=(22, 10))

Thanks!

1 answer:

Answer 0 (score: 1)

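groupby is a pandas method, not part of sklearn. The expression groups the rows by date_x and computes the mean of outcome within each group; size() counts the rows in each group.

I think a better approach is to use DataFrameGroupBy.agg, which aggregates both size and mean per date_x group in a single call:

d = {'mean':'Class probability','size':'Frequency'}
df = df_train.groupby('date_x')['outcome'].agg(['mean','size']).rename(columns=d)
df.plot( secondary_y='Frequency',figsize=(22, 10))

Sample, with the groups marked: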

import pandas as pd

d = {'date_x':pd.to_datetime(['2015-01-01','2015-01-01','2015-01-01',
                              '2015-01-02','2015-01-02']),
     'outcome':[20,30,40,50,60]}
df_train = pd.DataFrame(d)
print (df_train)
      date_x  outcome
0 2015-01-01       20 ->1.group
1 2015-01-01       30 ->1.group
2 2015-01-01       40 ->1.group
3 2015-01-02       50 ->2.group
4 2015-01-02       60 ->2.group

d = {'mean':'Class probability','size':'Frequency'}
df = df_train.groupby('date_x')['outcome'].agg(['mean','size']).rename(columns=d)
print (df)
            Class probability  Frequency
date_x                                  
2015-01-01                 30          3
2015-01-02                 55          2
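
To make the grouped mean concrete, the following minimal sketch recomputes it by hand on the same sample data: it filters the rows for each date and divides the sum of outcome by the row count. The names grouped_mean, manual and manual_mean are illustrative only and do not come from the original kernel.

```python
import pandas as pd

# Same sample data as above
df_train = pd.DataFrame({
    'date_x': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-01',
                              '2015-01-02', '2015-01-02']),
    'outcome': [20, 30, 40, 50, 60],
})

# Grouped mean: one value per distinct date_x
grouped_mean = df_train.groupby('date_x')['outcome'].mean()

# The same numbers computed by hand, one date at a time
manual = {}
for date in df_train['date_x'].unique():
    rows = df_train[df_train['date_x'] == date]       # rows belonging to this group
    manual[pd.Timestamp(date)] = rows['outcome'].sum() / len(rows)

manual_mean = pd.Series(manual)  # hypothetical name, for comparison only
```

Both grouped_mean and manual_mean hold one value per date, which is why the result is indexed by date_x rather than by the original row numbers.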

For more information, see "applying multiple functions at once" in the pandas documentation.
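
As a cross-check, the two separate calls from the question can be compared with the single agg call. This is only a sketch: how date_x was created in the original kernel is not shown in the question, so building it from a dict of Series here is an assumption.

```python
import pandas as pd

df_train = pd.DataFrame({
    'date_x': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-01',
                              '2015-01-02', '2015-01-02']),
    'outcome': [20, 30, 40, 50, 60],
})

# The question's approach: two separate group-by passes.
# Assumption: date_x is a DataFrame assembled from the two results.
date_x = pd.DataFrame({
    'Class probability': df_train.groupby('date_x')['outcome'].mean(),
    'Frequency': df_train.groupby('date_x')['outcome'].size(),
})

# The answer's approach: one pass, then rename the columns
d = {'mean': 'Class probability', 'size': 'Frequency'}
df = df_train.groupby('date_x')['outcome'].agg(['mean', 'size']).rename(columns=d)

# Both frames carry the same values per date
same_mean = date_x['Class probability'].tolist() == df['Class probability'].tolist()
same_size = date_x['Frequency'].tolist() == df['Frequency'].tolist()
```

The agg version groups the data once and keeps the column naming in one place, which is the main reason to prefer it.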
