I have gone through all the suggested answers here, but every one I try breaks my code.
A sample of my data:
print(transactions.head())
loc amount local_date
0 RAIL 8.1 2016-09-30
1 LINK NETWORK LIMIT 4.0 2016-10-02
2 CHOCOLATE CAFE 3.0 2016-10-03
3 Four Star Pizza 9.7 2016-10-03
4 Cinema 10.0 2016-10-04
I just want to group by year and month and get the total transaction amount for each group.
For example, my expected result:
2019 Jan 100
Feb 123
Mar 150
etc.
2018 Jan 200
Feb 150
Mar 211
etc.
What I have tried (basically all of the suggested answers):
transactions.set_index('local_date').groupby([(transactions.index.year),(transactions.index.month)])['amount'].sum()
AttributeError Traceback (most recent call last)
<ipython-input-332-64938cfdee85> in <module>
----> 1 transactions.set_index('local_date').groupby([(transactions.index.year),(transactions.index.month)])['amount'].sum()
AttributeError: 'RangeIndex' object has no attribute 'year'
transactions.set_index('local_date').groupby([(transactions.index.dt.year),(transactions.index.dt.month)])['amount'].sum()
AttributeError Traceback (most recent call last)
<ipython-input-334-150c05241676> in <module>
----> 1 transactions.set_index('local_date').groupby([(transactions.index.dt.year),(transactions.index.dt.month)])['amount'].sum()
AttributeError: 'RangeIndex' object has no attribute 'dt'
transactions.set_index('local_date').groupby([(transactions.index.to_series().dt.year),(transactions.index.to_series().dt.month)])['amount'].sum()
AttributeError: Can only use .dt accessor with datetimelike values
I'm lost. Where am I going wrong?
Answer 0 (score: 2)
Use Series.dt + DataFrame.groupby, after converting the column to datetime:
df['local_date']=pd.to_datetime(df['local_date'])
df.groupby([df['local_date'].dt.year,df['local_date'].dt.month])['amount'].sum()
local_date local_date
2016 9 8.1
10 26.7
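For context, here is a minimal sketch of why the attempts in the question fail. It assumes local_date is stored as plain strings, which is what the last error message suggests. set_index returns a new DataFrame, so transactions.index inside the groupby call still refers to the original RangeIndex, and string values have no .dt accessor until they are converted. The sample frame is rebuilt from the question's data, and the level names year and month are chosen here only for readability.

```python
import pandas as pd

# Sample rebuilt from the question; local_date is ASSUMED to hold strings.
transactions = pd.DataFrame({
    "loc": ["RAIL", "LINK NETWORK LIMIT", "CHOCOLATE CAFE",
            "Four Star Pizza", "Cinema"],
    "amount": [8.1, 4.0, 3.0, 9.7, 10.0],
    "local_date": ["2016-09-30", "2016-10-02", "2016-10-03",
                   "2016-10-03", "2016-10-04"],
})

# set_index returns a new frame; the original keeps its RangeIndex,
# which is why transactions.index.year raises AttributeError.
indexed = transactions.set_index("local_date")
print(type(transactions.index).__name__)

# Even the new index is not datetime-like yet, so .dt would still fail.
print(pd.api.types.is_datetime64_any_dtype(indexed.index))

# Convert first, then group on the converted values.
dates = pd.to_datetime(transactions["local_date"])
totals = transactions.groupby([dates.dt.year, dates.dt.month])["amount"].sum()
totals.index.names = ["year", "month"]  # illustrative level names
print(totals)
```

Grouping on a separate converted Series leaves the original column untouched; assigning the converted values back to the column, as in the answer, works just as well.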
If you prefer, show month names instead:
new_df=df.groupby([df['local_date'].dt.year,df['local_date'].dt.month_name()])['amount'].sum().to_frame('Total amount')
print(new_df)
Total amount
local_date local_date
2016 October 26.7
September 8.1
Alternatively, group by monthly period with dt.to_period('M'):
new_df=df.groupby(df['local_date'].dt.to_period('M')).amount.sum().to_frame('Total_amount')
print(new_df)
Total_amount
local_date
2016-09 8.1
2016-10 26.7
Answer 1 (score: 2)
You can group by year and month:
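# assumes local_date has already been converted with pd.to_datetime
(transactions.groupby([transactions.local_date.dt.year,
                       transactions.local_date.dt.month])
 [['amount']]  # sum only the numeric column
 .sum())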
Output:
amount
local_date local_date
2016 9 8.1
10 26.7
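If you want month names, replace .dt.month with .dt.month_name(), but the names then sort alphabetically (October before September), so getting calendar order takes a bit more work: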
amount
local_date local_date
2016 October 26.7
September 8.1
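One way to do that extra work, sketched here as an assumption rather than as what the answer's author had in mind, is to turn the month names into an ordered categorical so that groupby sorts them in calendar order. The names month_order, months, and totals are illustrative, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample rebuilt from the question, with local_date already parsed.
transactions = pd.DataFrame({
    "amount": [8.1, 4.0, 3.0, 9.7, 10.0],
    "local_date": pd.to_datetime(["2016-09-30", "2016-10-02", "2016-10-03",
                                  "2016-10-03", "2016-10-04"]),
})

# January..December in calendar order, generated by pandas itself.
month_order = pd.date_range("2000-01-01", periods=12, freq="MS").month_name()

dates = transactions["local_date"]
# An ordered categorical sorts by category position, not alphabetically.
months = pd.Series(
    pd.Categorical(dates.dt.month_name(), categories=month_order, ordered=True),
    index=transactions.index,
    name="month",
)

# observed=True keeps only the year/month combinations present in the data.
totals = (transactions
          .groupby([dates.dt.year.rename("year"), months], observed=True)
          ["amount"]
          .sum())
print(totals)
```

With this grouping, September is listed before October. Grouping by the month number and relabelling afterwards would be another reasonable route.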