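Question: I am trying to use groupby on a simple dataframe (a downloadable csv) and then agg to return aggregate values for a column (size, sum, mean, standard deviation). What looks like a simple problem is producing unexpectedly tricky errors.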
Top15.groupby('Continent')['Pop Est'].agg(np.mean, np.std...etc)
# returns
ValueError: No axis named <function std at 0x7f16841512f0> for object type <class 'pandas.core.series.Series'>
What I want to get is a df whose index is set to the continent, with the columns ['size', 'sum', 'mean', 'std'].
Sample code
import pandas as pd
import numpy as np
# Create df
df = pd.DataFrame({'Country':['Australia','China','America','Germany'],'Pop Est':['123','234','345','456'],'Continent':['Asia','Asia','North America','Europe']})
# group and agg
df = df.groupby('Continent')['Pop Est'].agg('size','sum','np.mean','np.std')
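Answer 0 (score: 3)
Aggregations such as sum, mean and std only give meaningful results on numeric values, so when you create your dataframe, don't enter the numbers as strings: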
I think this should get you what you want:
df = pd.DataFrame({'Country':['Australia','China','America','Germany'],'PopEst':[123,234,345,456],'Continent':['Asia','Asia','North America','Europe']})
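With the numeric dataframe above, the aggregation step could look like the sketch below. It assumes the intent is one column per statistic, so the function names go to agg as a single list. The name `stats` is just an illustrative variable. The ValueError in the question most likely arises because agg forwards extra positional arguments to the first function, so np.std ends up being treated as the axis argument of np.mean.

```python
import pandas as pd

# Same sample data as in the answer, with the population stored as integers.
df = pd.DataFrame({
    'Country': ['Australia', 'China', 'America', 'Germany'],
    'PopEst': [123, 234, 345, 456],
    'Continent': ['Asia', 'Asia', 'North America', 'Europe'],
})

# Pass all aggregations as ONE list; each entry becomes a column of the result.
# `stats` is a hypothetical name chosen for this example.
stats = df.groupby('Continent')['PopEst'].agg(['size', 'sum', 'mean', 'std'])
print(stats)
```

The result is indexed by Continent. Groups with only one country (Europe and North America here) get NaN for std, because the sample standard deviation is undefined for a single value. If the real csv loads the population column as strings, converting it first with pd.to_numeric is one option.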