Sample data; the real data covers many years. The type "Lien" or "Lien Endorsement" can appear only once per year. Other types can repeat within a year.
tax_allyears =
tax_year type amount
2013 Lien Interest 4
2014 Lien Interest 10
2014 Lien 100
2014 Lien Interest 15
2013 Lien Endorsement 200
This line almost works; it sums the "Lien Interest" values by year.
by_year_interest = tax_allyears_1[tax_allyears_1['type'] == 'Lien Interest'].groupby(by=['tax_year'])['amount'].sum()
What I want is to separate the "Lien Interest" totals by whether the year has a "Lien" or a "Lien Endorsement".
by_year_Lien_interest = some function
tax_year amount
2014 25
by_year_Lien_Endorsement_interest = some function
tax_year amount
2013 4
Answer 0 (score: 1)
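You can first build two separate lists of years, one for the years that contain a Lien row and one for the years that contain a Lien Endorsement row. Then use those lists of unique years in your condition, with Series.isin, to filter the tax_allyears DataFrame. Example -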
lienyears = tax_allyears.loc[tax_allyears['type'] == 'Lien','tax_year'].unique().tolist()
lienendorsementyears = tax_allyears.loc[tax_allyears['type'] == 'Lien Endorsement','tax_year'].unique().tolist()
by_year_lien_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienyears)].groupby('tax_year')['amount'].sum()
by_year_lien_endorsement_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienendorsementyears)].groupby('tax_year')['amount'].sum()
Demo -
In [7]: tax_allyears
Out[7]:
tax_year type amount
0 2013 Lien Interest 4
1 2014 Lien Interest 10
2 2014 Lien 100
3 2014 Lien Interest 15
4 2013 Lien Endorsement 200
In [9]: lienyears = tax_allyears.loc[tax_allyears['type'] == 'Lien','tax_year'].unique().tolist()
In [10]: lienendorsementyears = tax_allyears.loc[tax_allyears['type'] == 'Lien Endorsement','tax_year'].unique().tolist()
In [13]: by_year_lien_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienyears)].groupby('tax_year')['amount'].sum()
In [15]: by_year_lien_endorsement_interest = tax_allyears[(tax_allyears['type'] == 'Lien Interest') & tax_allyears['tax_year'].isin(lienendorsementyears)].groupby('tax_year')['amount'].sum()
In [16]: by_year_lien_interest
Out[16]:
tax_year
2014 25
Name: amount, dtype: int64
In [17]: by_year_lien_endorsement_interest
Out[17]:
tax_year
2013 4
Name: amount, dtype: int64
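The same steps can be wrapped in a small helper so the filter is written once. This is only a sketch: the function name interest_for_years_with and its marker_type parameter are made up for illustration, and the DataFrame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data from the question
tax_allyears = pd.DataFrame({
    "tax_year": [2013, 2014, 2014, 2014, 2013],
    "type": ["Lien Interest", "Lien Interest", "Lien",
             "Lien Interest", "Lien Endorsement"],
    "amount": [4, 10, 100, 15, 200],
})


def interest_for_years_with(df, marker_type):
    """Sum 'Lien Interest' per year, keeping only years that contain marker_type.

    Hypothetical helper, not part of pandas.
    """
    # Years in which a row of the marker type (e.g. "Lien") appears
    years = df.loc[df["type"] == marker_type, "tax_year"].unique()
    # Interest rows that fall in one of those years
    mask = (df["type"] == "Lien Interest") & df["tax_year"].isin(years)
    return df[mask].groupby("tax_year")["amount"].sum()


by_year_lien_interest = interest_for_years_with(tax_allyears, "Lien")
by_year_lien_endorsement_interest = interest_for_years_with(
    tax_allyears, "Lien Endorsement"
)

print(by_year_lien_interest)
print(by_year_lien_endorsement_interest)
```

Series.isin accepts the array returned by unique() directly, so the tolist() call is optional here.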
Answer 1 (score: 0)
If tax_year, type and amount are the names of the columns in the DataFrame, then you can do this:
# Create a groupby object
name = df.groupby(['tax_year', 'type'])
# Apply the sum function to the groupby object
df = name.sum()
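For reference, here is a sketch of what grouping on both columns produces on the sample data from the question. The variable names totals and interest are illustrative only.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "tax_year": [2013, 2014, 2014, 2014, 2013],
    "type": ["Lien Interest", "Lien Interest", "Lien",
             "Lien Interest", "Lien Endorsement"],
    "amount": [4, 10, 100, 15, 200],
})

# One total per (tax_year, type) pair; the result has a two-level MultiIndex
totals = df.groupby(["tax_year", "type"])["amount"].sum()
print(totals)

# Pull out a single type across all years via the "type" index level
interest = totals.xs("Lien Interest", level="type")
print(interest)
```

Note that this gives the interest total for every year. On its own it does not separate years that have a Lien from years that have a Lien Endorsement, so a filter on the year, such as the isin approach in the other answer, is still needed for that.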