https://www.kaggle.com/anokas/time-travel-eda
What exactly does this code mean: groupby('date_x')['outcome'].mean()? I can't find it in the sklearn docs.
date_x['Class probability'] = df_train.groupby('date_x')['outcome'].mean()
date_x['Frequency'] = df_train.groupby('date_x')['outcome'].size()
date_x.plot( secondary_y='Frequency',figsize=(22, 10))
Thanks!
Answer 0 (score: 1)
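groupby is a pandas method rather than part of scikit-learn, which is why it does not appear in the sklearn docs. I think a better approach is to use DataFrameGroupBy.agg to compute the size and the mean of each date_x group in a single call:
d = {'mean':'Class probability','size':'Frequency'}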
df = df_train.groupby('date_x')['outcome'].agg(['mean','size']).rename(columns=d)
df.plot( secondary_y='Frequency',figsize=(22, 10))
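As a rough check that the single agg call gives the same table as the two separate groupby calls in the question, here is a minimal sketch. The toy rows are made up for illustration and are not from the Kaggle data; the names `separate` and `combined` are hypothetical.

```python
import pandas as pd

# Toy data, invented for illustration (not the Kaggle dataset).
df_train = pd.DataFrame({
    'date_x': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-02']),
    'outcome': [0, 1, 1],
})

grouped = df_train.groupby('date_x')['outcome']

# Approach from the question: one pass per statistic.
separate = pd.DataFrame({
    'Class probability': grouped.mean(),
    'Frequency': grouped.size(),
})

# Approach from the answer: both statistics in one agg call, then rename.
d = {'mean': 'Class probability', 'size': 'Frequency'}
combined = grouped.agg(['mean', 'size']).rename(columns=d)

print(separate)
print(combined)
```

Both frames are indexed by date_x and carry the same two columns, so either one can be passed to plot with secondary_y='Frequency'.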
Sample, with the groups marked:
import pandas as pd

d = {'date_x':pd.to_datetime(['2015-01-01','2015-01-01','2015-01-01',
                             '2015-01-02','2015-01-02']),
     'outcome':[20,30,40,50,60]}
df_train = pd.DataFrame(d)
print (df_train)
      date_x  outcome
0 2015-01-01       20  ->1.group
1 2015-01-01       30  ->1.group
2 2015-01-01       40  ->1.group
3 2015-01-02       50  ->2.group
4 2015-01-02       60  ->2.group
d = {'mean':'Class probability','size':'Frequency'}
df = df_train.groupby('date_x')['outcome'].agg(['mean','size']).rename(columns=d)
print (df)
            Class probability  Frequency
date_x
2015-01-01                 30          3
2015-01-02                 55          2
For more information, see "Applying multiple functions at once" in the pandas documentation.
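To make the meaning of the two statistics concrete, the sketch below does the same computation by hand. It assumes outcome is a 0/1 label (suggested by the column name 'Class probability', but not stated above), in which case the per-date mean is simply the fraction of rows with outcome == 1. The data and the `manual` dict are hypothetical.

```python
import pandas as pd

# Hypothetical 0/1 outcomes, invented for illustration.
df_train = pd.DataFrame({
    'date_x': pd.to_datetime(['2015-01-01'] * 4 + ['2015-01-02'] * 2),
    'outcome': [1, 0, 1, 1, 0, 1],
})

# Do by hand what groupby('date_x')['outcome'].agg(['mean', 'size']) does.
manual = {}
for date, outcomes in df_train.groupby('date_x')['outcome']:
    positives = outcomes.sum()   # rows in this group with outcome == 1
    rows = len(outcomes)         # group size, i.e. 'Frequency'
    manual[date] = (positives / rows, rows)  # (class probability, frequency)

for date, (prob, rows) in manual.items():
    print(date.date(), prob, rows)
```

Plotting the frequency next to the probability is useful because a mean computed from only a few rows is much noisier than one computed from thousands.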