I have this DataFrame:
import pandas as pd

df = pd.DataFrame(
    {'cn': [1, 1, 1, 1, 2, 2, 2],
     'date': ['01/10/2017', '02/09/2016', '02/10/2016', '01/20/2017',
              '05/15/2017', '02/10/2016', '02/10/2018'],
     'score': [4, 10, 6, 5, 15, 7, 8]})
   cn        date  score
0   1  01/10/2017      4
1   1  02/09/2016     10
2   1  02/10/2016      6
3   1  01/20/2017      5
4   2  05/15/2017     15
5   2  02/10/2016      7
6   2  02/10/2018      8
I have the following two functions:
def total_count_phq9_BOF_activation(grf):
    s = grf.score.count()
    return s

def first_phq9_BOF_activation(grf):
    value = grf[grf.score == grf.score.max()].date
    return value
Following this solution (1), I tried to use both functions with the apply method:
df.groupby('cn').apply (lambda x: pd.Series({"first_phq9_BOF_activation": first_phq9_BOF_activation , "total_count_phq9_BOF_activation": total_count_phq9_BOF_activation}))
But it does not work. Could you tell me which part of my code is wrong?
Answer 0 (score: 0)
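You are not calling total_count_phq9_BOF_activation and first_phq9_BOF_activation inside the Series constructor; you are only passing the function objects themselves. apply does not call them for you. They are evaluated as part of the Series constructor, so you need to call them explicitly with (x):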
df.groupby('cn').apply (lambda x: pd.Series({"first_phq9_BOF_activation": first_phq9_BOF_activation(x) ,
"total_count_phq9_BOF_activation": total_count_phq9_BOF_activation(x)}))
Out[157]:
                        first_phq9_BOF_activation  total_count_phq9_BOF_activation
cn
1   1    02/09/2016
    Name: date, dtype: object                                                    4
2   4    05/15/2017
    Name: date, dtype: object                                                    3
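Note that in the output above each cell of first_phq9_BOF_activation holds a whole Series (hence the "Name: date, dtype: object" lines), because the boolean filter in that function returns a Series rather than a single value. If one date per group is what you want, a minimal sketch is below. It assumes that ties on the maximum score should resolve to the first matching row, which the question does not state, and it selects the date and score columns before apply so the lambda never sees the grouping column (my choice, not something the original code does).

```python
import pandas as pd

df = pd.DataFrame({
    "cn": [1, 1, 1, 1, 2, 2, 2],
    "date": ["01/10/2017", "02/09/2016", "02/10/2016", "01/20/2017",
             "05/15/2017", "02/10/2016", "02/10/2018"],
    "score": [4, 10, 6, 5, 15, 7, 8],
})

def total_count_phq9_BOF_activation(grf):
    # Number of non-null scores in the group.
    return grf["score"].count()

def first_phq9_BOF_activation(grf):
    # idxmax gives the index label of the first row with the maximum score
    # (assumption: first match wins on ties); .loc then returns one date
    # string instead of a Series.
    return grf.loc[grf["score"].idxmax(), "date"]

result = df.groupby("cn")[["date", "score"]].apply(
    lambda x: pd.Series({
        "first_phq9_BOF_activation": first_phq9_BOF_activation(x),
        "total_count_phq9_BOF_activation": total_count_phq9_BOF_activation(x),
    })
)
print(result)
```

With scalar return values, each group contributes one plain row, so the result is an ordinary DataFrame indexed by cn with one date and one count per group.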