I have the following pandas Series:
Reducedset['% Renewable']
which gives me:
Asia           China                 19.7549
               Japan                 10.2328
               India                 14.9691
               South Korea           2.27935
               Iran                  5.70772
North America  United States         11.571
               Canada                61.9454
Europe         United Kingdom        10.6005
               Russian Federation    17.2887
               Germany               17.9015
               France                17.0203
               Italy                 33.6672
               Spain                 37.9686
Australia      Australia             11.8108
South America  Brazil                69.648
Name: % Renewable, dtype: object
I then cut that Series into 5 bins:
binning = pd.cut(Top15['% Renewable'],5)
which gives me:
Asia           China                 (15.753, 29.227]
               Japan                 (2.212, 15.753]
               India                 (2.212, 15.753]
               South Korea           (2.212, 15.753]
               Iran                  (2.212, 15.753]
North America  United States         (2.212, 15.753]
               Canada                (56.174, 69.648]
Europe         United Kingdom        (2.212, 15.753]
               Russian Federation    (15.753, 29.227]
               Germany               (15.753, 29.227]
               France                (15.753, 29.227]
               Italy                 (29.227, 42.701]
               Spain                 (29.227, 42.701]
Australia      Australia             (2.212, 15.753]
South America  Brazil                (56.174, 69.648]
Name: % Renewable, dtype: category
Categories (5, interval[float64]): [(2.212, 15.753] < (15.753, 29.227] < (29.227, 42.701] <
                                    (42.701, 56.174] < (56.174, 69.648]]
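For reference, the bin edges shown above can be inspected directly by passing retbins=True to pd.cut. This is only a minimal sketch: it assumes the values are plain floats (the Series above reports dtype object), and the names values, binned and edges are illustrative, not from the original code.

```python
import pandas as pd

# The 15 "% Renewable" values from the Series above, as floats (assumption).
values = pd.Series([19.7549, 10.2328, 14.9691, 2.27935, 5.70772,
                    11.571, 61.9454, 10.6005, 17.2887, 17.9015,
                    17.0203, 33.6672, 37.9686, 11.8108, 69.648])

# retbins=True also returns the array of bin edges (6 edges for 5 bins).
binned, edges = pd.cut(values, 5, retbins=True)

# The bins are equal-width between min and max; the lowest edge is nudged
# slightly below the minimum so that the minimum falls inside the first bin.
print(edges.round(3))
```

With an integer bins argument the edges depend on the min and max of the data, so they will change if the data changes.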
I then group by these bins to count the number of countries in each bin:
Reduced = Reducedset.groupby(binning)['% Renewable'].agg(['count'])
which gives me:
% Renewable
(2.212, 15.753] 7
(15.753, 29.227] 4
(29.227, 42.701] 2
(42.701, 56.174] 0
(56.174, 69.648] 2
Name: count, dtype: int64
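However, the original index is gone, and I still want to keep the continent level (the outer level of the index).
So, to the left of the [% Renewable] column, it should show: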
Asia
North America
Europe
Australia
South America
When I try to do this as follows:
print(Reducedset['% Renewable'].groupby([Reducedset['% Renewable'].index.get_level_values(0),pd.cut(Reducedset['% Renewable'],5)]).count())
It works!
Problem solved!
Answer 0 (score: 1)
Let's assume the following data:
import numpy as np
import pandas as pd

np.random.seed(1)
s = pd.Series(np.random.randint(0, 10, 16),
              index=pd.MultiIndex.from_arrays([list('aaaabbccdddddeee'),
                                               list('abcdefghijklmnop')]))
Then, IIUC, what you are looking for is:
print(s.groupby([s.index.get_level_values(0), #that is the continent for you
pd.cut(s, 5)]) #that is the binning you created
.count())
a (-0.009, 1.8] 0
(1.8, 3.6] 0
(3.6, 5.4] 2
(5.4, 7.2] 0
(7.2, 9.0] 2
b (-0.009, 1.8] 2
(1.8, 3.6] 0
(3.6, 5.4] 0
(5.4, 7.2] 0
(7.2, 9.0] 0
c (-0.009, 1.8] 1
(1.8, 3.6] 0
(3.6, 5.4] 0
(5.4, 7.2] 1
(7.2, 9.0] 0
d (-0.009, 1.8] 0
(1.8, 3.6] 1
(3.6, 5.4] 2
(5.4, 7.2] 1
(7.2, 9.0] 1
e (-0.009, 1.8] 0
(1.8, 3.6] 2
(3.6, 5.4] 1
(5.4, 7.2] 0
(7.2, 9.0] 0
dtype: int64
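The same pattern can be sketched on the data from the question. This is an illustration under assumptions, not the asker's exact code: the values are typed in as floats, the level names "Continent" and "Country" and the variable names renewable, bins, counts and table are made up for the example, and observed=False is passed explicitly so that empty bins are reported as 0 regardless of the pandas version's default.

```python
import pandas as pd

# Data copied from the question; level names are illustrative assumptions.
data = {
    ("Asia", "China"): 19.7549, ("Asia", "Japan"): 10.2328,
    ("Asia", "India"): 14.9691, ("Asia", "South Korea"): 2.27935,
    ("Asia", "Iran"): 5.70772,
    ("North America", "United States"): 11.571,
    ("North America", "Canada"): 61.9454,
    ("Europe", "United Kingdom"): 10.6005,
    ("Europe", "Russian Federation"): 17.2887,
    ("Europe", "Germany"): 17.9015, ("Europe", "France"): 17.0203,
    ("Europe", "Italy"): 33.6672, ("Europe", "Spain"): 37.9686,
    ("Australia", "Australia"): 11.8108,
    ("South America", "Brazil"): 69.648,
}
index = pd.MultiIndex.from_tuples(list(data), names=["Continent", "Country"])
renewable = pd.Series(list(data.values()), index=index, name="% Renewable")

# Five equal-width bins over the whole Series.
bins = pd.cut(renewable, 5)

# Group by the outer index level AND the bins, so the continent is kept.
counts = renewable.groupby(
    [renewable.index.get_level_values("Continent"), bins],
    observed=False,  # keep empty bins, reported with a count of 0
).count()

# Optional: one row per continent, one column per bin.
table = counts.unstack()
print(table)
```

Note that groupby sorts the group keys by default, so the continents come out in alphabetical order rather than in the order of the original index; passing sort=False to groupby is one way to change that.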