My input looks like this:
NAME            Geoid  Year  QTR  Index
'Abilene, TX    10180  1978  3    0
'Abilene, TX    10180  1978  4    0
'Abilene, TX    10180  1979  1    0
'Abilene, TX    10180  1979  2    0
'Decatur, IL    19500  1998  1    110.51
'Decatur, IL    19500  1998  2    110.48
'Decatur, IL    19500  1998  3    113.01
'Decatur, IL    19500  1998  4    114.16
'Fairbanks, AK  21820  1990  1    63.74
'Fairbanks, AK  21820  1990  2    70.68
'Fairbanks, AK  21820  1990  3    83.56
'Fairbanks, AK  21820  1990  4    83.95
The query I want to convert from MySQL to Python is:
SELECT geoid, name, YEAR, AVG(index)
FROM table_1
WHERE geoid>0
GROUP BY geoid, metro_name, YEAR;
From what I have read online, the pythonic equivalent of AVG is mean, but when I use it, it gives me a single value for the whole column:
pandas get column average/mean
But I want the output grouped so that the quarters of each year are averaged, like this:
Name            Geoid  YEAR  AVG(index)
'Abilene, TX    10180  1978  0
'Abilene, TX    10180  1979  0
'Decatur, IL    19500  1998  111.75
'Fairbanks, AK  21820  1990  74.9875
How can I achieve this?
Answer 0 (score: 3)
First filter with query or boolean indexing, then use groupby with the mean aggregation:
# Filter with query(), then average Index within each NAME/Geoid/Year group
df1 = df.query('Geoid > 0').groupby(['NAME','Geoid','Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825
# Same result, filtering with boolean indexing instead of query()
df1 = df[df['Geoid'] > 0].groupby(['NAME','Geoid','Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825
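For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the sample rows from the question are typed in by hand (the question does not show how df was loaded), keeps the leading apostrophe in NAME exactly as it appears there, and uses the variable names rows and overall purely for illustration. It also shows why calling mean on the column alone returns a single value: without groupby there is only one group, the whole column.

```python
import pandas as pd

# Sample rows copied from the question; the leading apostrophe in NAME
# is kept as it appears there (assumption: it is part of the raw data).
rows = [
    ("'Abilene, TX", 10180, 1978, 3, 0.0),
    ("'Abilene, TX", 10180, 1978, 4, 0.0),
    ("'Abilene, TX", 10180, 1979, 1, 0.0),
    ("'Abilene, TX", 10180, 1979, 2, 0.0),
    ("'Decatur, IL", 19500, 1998, 1, 110.51),
    ("'Decatur, IL", 19500, 1998, 2, 110.48),
    ("'Decatur, IL", 19500, 1998, 3, 113.01),
    ("'Decatur, IL", 19500, 1998, 4, 114.16),
    ("'Fairbanks, AK", 21820, 1990, 1, 63.74),
    ("'Fairbanks, AK", 21820, 1990, 2, 70.68),
    ("'Fairbanks, AK", 21820, 1990, 3, 83.56),
    ("'Fairbanks, AK", 21820, 1990, 4, 83.95),
]
df = pd.DataFrame(rows, columns=["NAME", "Geoid", "Year", "QTR", "Index"])

# Without groupby, mean() collapses the whole column to one scalar.
overall = df["Index"].mean()

# WHERE Geoid > 0 ... GROUP BY NAME, Geoid, Year ... AVG(Index)
df1 = (
    df.query("Geoid > 0")
      .groupby(["NAME", "Geoid", "Year"], as_index=False)["Index"]
      .mean()
)
print(df1)
```

Passing as_index=False keeps the grouping keys as ordinary columns, which matches the shape of a SQL result set more closely than the default of moving them into the index.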