Pandas groupby with multiple functions passed to agg returns an error

Asked: 2017-01-28 01:59:17

Tags: python-3.x pandas group-by aggregate

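Question: I am trying to use groupby on a simple dataframe (a downloadable csv) and then use agg to return aggregate values for one column (size, sum, mean, standard deviation). What looks like a simple task is raising an error I did not expect.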

Top15.groupby('Continent')['Pop Est'].agg(np.mean, np.std...etc)
# returns 
ValueError: No axis named <function std at 0x7f16841512f0> for object type <class 'pandas.core.series.Series'>

What I want is a df whose index is the continent and whose columns are ['size', 'sum', 'mean', 'std'].

Sample code

import pandas as pd
import numpy as np

# Create df
df = pd.DataFrame({'Country':['Australia','China','America','Germany'],'Pop Est':['123','234','345','456'],'Continent':['Asia','Asia','North America','Europe']})

# group and agg
df = df.groupby('Continent')['Pop Est'].agg('size','sum','np.mean','np.std')

1 answer:

Answer 0 (score: 3):

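Aggregations such as sum, mean and standard deviation only make sense on numeric values, so when you create the dataframe, do not enter the numbers as strings: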

I think this should get you what you want:

df = pd.DataFrame({'Country':['Australia','China','America','Germany'],'PopEst':[123,234,345,456],'Continent':['Asia','Asia','North America','Europe']})
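Beyond storing the population as numbers, the call to agg in the question also needs a change: the aggregations have to be passed as a single list, because additional positional arguments are not treated as further aggregation functions (this is most likely where the "No axis named <function std ...>" message comes from). The following is a minimal sketch under those assumptions, not necessarily the answerer's exact code. It starts from the question's string data and converts it with pd.to_numeric; the variable name result is only illustrative.

```python
import pandas as pd

# Same data as in the question, with the population entered as strings
df = pd.DataFrame({
    'Country': ['Australia', 'China', 'America', 'Germany'],
    'Pop Est': ['123', '234', '345', '456'],
    'Continent': ['Asia', 'Asia', 'North America', 'Europe'],
})

# Convert the string column to numbers so that mean and std are defined
df['Pop Est'] = pd.to_numeric(df['Pop Est'])

# Pass all aggregations as ONE list argument to agg
result = df.groupby('Continent')['Pop Est'].agg(['size', 'sum', 'mean', 'std'])
print(result)
```

The result is indexed by Continent with the columns size, sum, mean and std. Continents that contain a single country get NaN for std, since the sample standard deviation is undefined for one observation.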