I have a DataFrame that looks like this:
   year            geo_name  adult_obesity  some_college STATE_ABBR
0  2015  Autauga County, AL          0.313           NaN         AL
1  2016  Autauga County, AL          0.309         0.565         AL
2  2017  Autauga County, AL          0.341         0.597         AL
3  2013  Baldwin County, AL            NaN           NaN         AL
4  2014  Baldwin County, AL            NaN           NaN         AL
5  2015  Baldwin County, AL          0.250         0.625         AL
6  2016  Baldwin County, AL          0.267         0.623         AL
7  2017  Baldwin County, AL          0.274         0.629         AL
8  2015  Barbour County, AL          0.384         0.423         AL
9  2016  Barbour County, AL          0.408         0.434         AL
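I want to collapse all of this county-level data into state-level data by taking the mean for each state and year.
That is, I want a new dataset with one row per unique state and year, where each of the other columns (adult_obesity, some_college) holds the mean of the original rows that share that state and year.
Is there an easy way to do this with pandas?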
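For reference, here is a minimal sketch of the kind of result I am after, assuming a `groupby` on `STATE_ABBR` and `year` followed by `mean()` is the right tool. The variable name `state_year` is just a placeholder of mine, and the frame is rebuilt by hand from the sample rows above so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the rows shown in the question.
df = pd.DataFrame({
    "year": [2015, 2016, 2017, 2013, 2014, 2015, 2016, 2017, 2015, 2016],
    "geo_name": (
        ["Autauga County, AL"] * 3
        + ["Baldwin County, AL"] * 5
        + ["Barbour County, AL"] * 2
    ),
    "adult_obesity": [0.313, 0.309, 0.341, np.nan, np.nan,
                      0.250, 0.267, 0.274, 0.384, 0.408],
    "some_college": [np.nan, 0.565, 0.597, np.nan, np.nan,
                     0.625, 0.623, 0.629, 0.423, 0.434],
    "STATE_ABBR": ["AL"] * 10,
})

# One row per (state, year); mean() skips NaN values by default,
# so a group with only NaN values yields NaN for that column.
state_year = (
    df.groupby(["STATE_ABBR", "year"], as_index=False)[
        ["adult_obesity", "some_college"]
    ].mean()
)

print(state_year)
```

With this sample, the 2013 and 2014 rows come out as NaN in both columns because every county value for those years is missing. If those rows are unwanted, chaining `.dropna(how="all", subset=["adult_obesity", "some_college"])` should remove them.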