I am trying to group my data by year and sum the spend for each year.
Here is the sample data:
date: spend_amt:
2/1/2014 10000
2/5/2014 98
1/2/2015 5834.2
7/8/2017 561236
9/3/2017 568
28/1/2016 989895.3
My current code:
def yearlySpending(self):
    dfspendingYearly = pd.DataFrame()
    dfspendingYearly = self.dfGov.groupby(["date"])['spend_amt'].agg('sum')
    dfspendingYearly.groupby(dfspendingYearly["date"].dt.year)['spend_amt'].agg(['sum'])
I get an error: KeyError: 'date'
Desired output:
date: spend_amt:
2014 10098
2015 5834.2
2016 989895.3
2017 561804
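Answer 0 (score: 0)
Convert the date column to datetime, then group the DataFrame by year.
Possible duplicate of: grouping by year
# 28/1/2016 is only valid as day/month/year, so parse the dates day-first
df["date:"] = pd.to_datetime(df["date:"], dayfirst=True)
# select the spend column explicitly so the datetime column is not summed
df.groupby(df["date:"].dt.year)["spend_amt:"].sum().reset_index()
Out: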
date: spend_amt:
0 2014 10098.0
1 2015 5834.2
2 2016 989895.3
3 2017 561804.0
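For reference, here is a minimal self-contained sketch of the approach above, run on the sample rows from the question. It assumes the columns are named `date` and `spend_amt` (without the trailing colons) and that the dates are day-first; both are assumptions, not something the question confirms.

```python
import pandas as pd

# Sample rows copied from the question; column names are an assumption.
df = pd.DataFrame({
    "date": ["2/1/2014", "2/5/2014", "1/2/2015", "7/8/2017", "9/3/2017", "28/1/2016"],
    "spend_amt": [10000, 98, 5834.2, 561236, 568, 989895.3],
})

# Assumption: day/month/year, because 28/1/2016 cannot be month/day/year.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")

# Group by the year extracted from the datetime column and sum the spend.
yearly = (
    df.groupby(df["date"].dt.year)["spend_amt"]
      .sum()
      .reset_index()
)
print(yearly)
```

Selecting `spend_amt` before calling `sum()` keeps the datetime column out of the aggregation, so the result has only the year and the total.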
Answer 1 (score: 0)
Your error means there is no column named date; I guess there is an index called date instead.
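A minimal sketch of why this happens, using the sample rows from the question: the first `groupby(["date"])` returns a Series whose index is `date`, so there is no longer a `date` column to look up. The names `per_day` and `per_year` are made up for illustration, and day-first dates are an assumption.

```python
import pandas as pd

# Sample rows from the question; dates assumed to be day/month/year.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2/1/2014", "2/5/2014", "1/2/2015", "7/8/2017", "9/3/2017", "28/1/2016"],
        format="%d/%m/%Y",
    ),
    "spend_amt": [10000, 98, 5834.2, 561236, 568, 989895.3],
})

# Same first step as in the question: the result is a Series indexed by date.
per_day = df.groupby(["date"])["spend_amt"].agg("sum")
print(type(per_day).__name__)  # Series
print(per_day.index.name)      # date

# "date" is now the index, not a column, so group on the index's year instead.
per_year = per_day.groupby(per_day.index.year).sum()
print(per_year)
```

If the per-day totals are not needed, the first `groupby` can be skipped and the original DataFrame grouped by `df["date"].dt.year` directly.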