import datetime
import pandas as pd
a = pd.DataFrame({'Entreprise': {0: 110, 1: 110, 2: 110, 3: 110, 4: 110},
'Etablissement': {0: 'SVR RUN',
1: 'SVR RUN',
2: 'SVR RUN',
3: 'SVR RUN',
4: 'SVR RUN'},
'Date_achat_as_date': {0: datetime.datetime(1996, 12, 15, 0, 0),
1: datetime.datetime(1996, 12, 15, 0, 0),
2: datetime.datetime(2001, 1, 17, 0, 0),
3: datetime.datetime(2001, 1, 17, 0, 0),
4: datetime.datetime(2011, 7, 1, 0, 0)},
'Valeur_Brute': {0: 2820397.61,
1: 1188910.0,
2: 245029.17,
3: 124118.68,
4: 113382.0}})
gp_by = ["Entreprise", "Etablissement", "Date_achat_as_date"]
gp = a.groupby(gp_by)["Valeur_Brute"].sum()
a.set_index(gp_by).join(gp, rsuffix="_sum_day")
But when I run the same code on my real data (database), I get an error:
gp = database.groupby(gp_by)["Valeur_Brute"].sum()
database.set_index(gp_by).join(gp, rsuffix="_sum_day")
TypeError: '<' not supported between instances of 'str' and 'int'
I then assumed I had some null values somewhere, so I did:
database[["Entreprise"]] = database[["Entreprise"]].fillna(1)
database[["Etablissement"]] = database[["Etablissement"]].fillna(1)
database = database[~database.Date_achat_as_date.isnull()]
Then:
gp = database.groupby(gp_by)["Valeur_Brute"].sum()
database.set_index(gp_by).join(gp, rsuffix="_sum_day")
But I still get the same error message:
TypeError: '<' not supported between instances of 'str' and 'int'
Any idea what I am missing?
Edit: I solved the problem with:
database[gp_by] = database[gp_by].applymap(str)
But I still don't understand why or how it works :-/
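
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the error message is what Python raises when a str is compared with an int, so at least one of the grouping columns probably holds both kinds of value. Filling nulls in a text column such as Etablissement with the integer 1 would be one way to end up in that state. The small frame below is a hypothetical stand-in for database (its contents are assumed, not taken from the real data); it shows one way to list the Python types present in a key column, and why casting to str removes the mix.

```python
import pandas as pd

# Hypothetical stand-in for `database`: a key column that mixes str and int,
# which is the state a text column can reach after fillna(1).
database = pd.DataFrame({
    "Etablissement": pd.Series(["SVR RUN", 1, "SVR RUN"], dtype=object),
    "Valeur_Brute": [1.0, 2.0, 3.0],
})

# Collect the Python types actually stored in the column.
types_before = set(database["Etablissement"].map(type))
print(types_before)  # contains both str and int

# Ordering a str against an int is what produces the reported TypeError.
try:
    "SVR RUN" < 1
except TypeError as exc:
    message = str(exc)
    print(message)

# Casting to str makes every key comparable with every other key.
database["Etablissement"] = database["Etablissement"].astype(str)
types_after = set(database["Etablissement"].map(type))
print(types_after)  # only str remains
```

Running the same type check on each column in gp_by of the real data should show which column is mixed. If that is the cause, it would explain why converting the key columns to str made the join work.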