I have the following pandas Series:
Asia           China                19.7549
               Japan                10.2328
               India                14.9691
               South Korea          2.27935
               Iran                 5.70772
North America  United States         11.571
               Canada               61.9454
Europe         United Kingdom       10.6005
               Russian Federation   17.2887
               Germany              17.9015
               France               17.0203
               Italy                33.6672
               Spain                37.9686
Australia      Australia            11.8108
South America  Brazil                69.648
Name: % Renewable, dtype: object
I have binned this data into 5 bins:
binning = pd.cut(Reducedset['% Renewable'],5)
Next, I want to count the number of countries in each of these bins:
df.groupby(binning)['% Renewable'].agg(['count'])
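The final DataFrame should therefore be indexed only by "continents", not by country.
However, this expression does not work.
My current output looks like this: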
                  count
binning
(2.212, 15.753]       7
(15.753, 29.227]      4
(29.227, 42.701]      2
(56.174, 69.648]      2
I would like the "continents" index to appear here...
Can anyone help me?
Answer 0 (score: 2)
Make sure you are not making a simple mistake, such as using the wrong name for the DataFrame:
Reducedset.groupby(binning)['% Renewable'].agg(['count'])
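For anyone who wants to reproduce this, here is a minimal sketch. It assumes the data from the question is retyped by hand into a DataFrame named Reducedset with a (continents, countries) MultiIndex and float values. Renaming the pd.cut result to binning and passing observed=True are my own choices, made so that the output matches the four non-empty bins shown in the question.

```python
import pandas as pd

# Data retyped from the question; values are assumed to be plain floats.
rows = [
    ("Asia", "China", 19.7549),
    ("Asia", "Japan", 10.2328),
    ("Asia", "India", 14.9691),
    ("Asia", "South Korea", 2.27935),
    ("Asia", "Iran", 5.70772),
    ("North America", "United States", 11.571),
    ("North America", "Canada", 61.9454),
    ("Europe", "United Kingdom", 10.6005),
    ("Europe", "Russian Federation", 17.2887),
    ("Europe", "Germany", 17.9015),
    ("Europe", "France", 17.0203),
    ("Europe", "Italy", 33.6672),
    ("Europe", "Spain", 37.9686),
    ("Australia", "Australia", 11.8108),
    ("South America", "Brazil", 69.648),
]
Reducedset = pd.DataFrame(
    rows, columns=["continents", "countries", "% Renewable"]
).set_index(["continents", "countries"])

# Five equal-width bins; the rename only gives the result index a clear name.
binning = pd.cut(Reducedset["% Renewable"], 5).rename("binning")

# Group the same DataFrame that the bins were computed from.
# observed=True keeps only the bins that actually contain rows.
counts = Reducedset.groupby(binning, observed=True)["% Renewable"].agg(["count"])
print(counts)
```

Depending on the pandas version, leaving out observed may also list the empty (42.701, 56.174] bin with a count of 0.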
Answer 1 (score: 1)
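As I understand it, you have a DataFrame named Reducedset, indexed by continents and countries, with a % Renewable column.
Because you will later need the bin of each individual row, even if the index is changed in some cases, it is better to save binning as another column: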
Reducedset['binning'] = pd.cut(Reducedset['% Renewable'], 5)
The result is:
                                   % Renewable           binning
continents     countries
Asia           China                  19.75490  (15.753, 29.227]
               Japan                  10.23280   (2.212, 15.753]
               India                  14.96910   (2.212, 15.753]
               South Korea             2.27935   (2.212, 15.753]
               Iran                    5.70772   (2.212, 15.753]
North America  United States          11.57100   (2.212, 15.753]
               Canada                 61.94540  (56.174, 69.648]
Europe         United Kingdom         10.60050   (2.212, 15.753]
               Russian Federation     17.28870  (15.753, 29.227]
               Germany                17.90150  (15.753, 29.227]
               France                 17.02030  (15.753, 29.227]
               Italy                  33.66720  (29.227, 42.701]
               Spain                  37.96860  (29.227, 42.701]
Australia      Australia              11.81080   (2.212, 15.753]
South America  Brazil                 69.64800  (56.174, 69.648]
If you want only the continents in the index, you can run:
Reducedset.reset_index('countries', inplace=True)
You can then print it sorted by binning, and the result is:
                        countries  % Renewable           binning
continents
Asia                        Japan     10.23280   (2.212, 15.753]
Asia                        India     14.96910   (2.212, 15.753]
Asia                  South Korea      2.27935   (2.212, 15.753]
Asia                         Iran      5.70772   (2.212, 15.753]
North America       United States     11.57100   (2.212, 15.753]
Europe             United Kingdom     10.60050   (2.212, 15.753]
Australia               Australia     11.81080   (2.212, 15.753]
Asia                        China     19.75490  (15.753, 29.227]
Europe         Russian Federation     17.28870  (15.753, 29.227]
Europe                    Germany     17.90150  (15.753, 29.227]
Europe                     France     17.02030  (15.753, 29.227]
Europe                      Italy     33.66720  (29.227, 42.701]
Europe                      Spain     37.96860  (29.227, 42.701]
North America              Canada     61.94540  (56.174, 69.648]
South America              Brazil     69.64800  (56.174, 69.648]
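As you can see, the (2.212, 15.753] bin contains countries from 4 continents, so the country information is still needed (although you can keep it as a "regular" column).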
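The steps so far can be put together roughly as follows. This is only a sketch: the data is retyped from the question with float values assumed, the names flat, ordered and n_continents are mine, and it uses a non-inplace reset_index plus a stable sort so that rows keep their original order inside each bin.

```python
import pandas as pd

# Data retyped from the question; values are assumed to be plain floats.
rows = [
    ("Asia", "China", 19.7549),
    ("Asia", "Japan", 10.2328),
    ("Asia", "India", 14.9691),
    ("Asia", "South Korea", 2.27935),
    ("Asia", "Iran", 5.70772),
    ("North America", "United States", 11.571),
    ("North America", "Canada", 61.9454),
    ("Europe", "United Kingdom", 10.6005),
    ("Europe", "Russian Federation", 17.2887),
    ("Europe", "Germany", 17.9015),
    ("Europe", "France", 17.0203),
    ("Europe", "Italy", 33.6672),
    ("Europe", "Spain", 37.9686),
    ("Australia", "Australia", 11.8108),
    ("South America", "Brazil", 69.648),
]
Reducedset = pd.DataFrame(
    rows, columns=["continents", "countries", "% Renewable"]
).set_index(["continents", "countries"])

# Store each row's bin as a regular column.
Reducedset["binning"] = pd.cut(Reducedset["% Renewable"], 5)

# Move only the "countries" level out of the index; "continents" stays.
flat = Reducedset.reset_index("countries")

# mergesort is stable, so rows keep their original order inside each bin.
ordered = flat.sort_values("binning", kind="mergesort")
print(ordered)

# Count the distinct continents in the lowest bin (category code 0).
in_lowest_bin = (ordered["binning"].cat.codes == 0).to_numpy()
n_continents = ordered.index[in_lowest_bin].nunique()
print(n_continents)
```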
Now you can also perform the aggregation, but with a small change:
Reducedset.groupby('binning')['% Renewable'].agg(['count'])
(Note Reducedset instead of df, and the quotes around binning, since it is now a column in the DataFrame.)
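Here is a runnable sketch of that aggregation, again with the question's data retyped by hand and float values assumed. The second part goes beyond what is written above: if the goal is a table indexed only by continent, one possible approach (my suggestion, with the hypothetical name per_continent) is to group by the continents index level and the binning column together and then unstack the bins into columns.

```python
import pandas as pd

# Data retyped from the question; values are assumed to be plain floats.
rows = [
    ("Asia", "China", 19.7549),
    ("Asia", "Japan", 10.2328),
    ("Asia", "India", 14.9691),
    ("Asia", "South Korea", 2.27935),
    ("Asia", "Iran", 5.70772),
    ("North America", "United States", 11.571),
    ("North America", "Canada", 61.9454),
    ("Europe", "United Kingdom", 10.6005),
    ("Europe", "Russian Federation", 17.2887),
    ("Europe", "Germany", 17.9015),
    ("Europe", "France", 17.0203),
    ("Europe", "Italy", 33.6672),
    ("Europe", "Spain", 37.9686),
    ("Australia", "Australia", 11.8108),
    ("South America", "Brazil", 69.648),
]
Reducedset = pd.DataFrame(
    rows, columns=["continents", "countries", "% Renewable"]
).set_index(["continents", "countries"])
Reducedset["binning"] = pd.cut(Reducedset["% Renewable"], 5)
Reducedset.reset_index("countries", inplace=True)

# "binning" is now a column, so it is passed by name as a string.
counts = Reducedset.groupby("binning", observed=True)["% Renewable"].agg(["count"])
print(counts)

# Suggested extension: countries per continent and bin.
# "continents" is an index level and "binning" a column; groupby accepts both.
per_continent = (
    Reducedset.groupby(["continents", "binning"], observed=True)
    .size()
    .unstack(fill_value=0)  # bins become columns, continents stay as the index
)
print(per_continent)
```

The unstacked table has one row per continent and one column per bin, which seems closest to what the question asks for.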