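Pandas newbie here, stuck on a simple problem I can't figure out.
I have a dataset of US baby names that looks like this:
I'm trying to write a program where I can enter a list of names and get back the percentage of the time each name is used for males or females (the year is irrelevant for my purposes).
I wrote a groupby and then added the male and female counts together.
What I need now is to compute percentages from this data. I think it is some kind of transform (right?), but I can't seem to write anything that works. I know how I would do it in SQL, but I really want to figure out Pandas. Some pointers would be much appreciated!
Thanks!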
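Answer 0: (score: 1)
If I understand what you are looking for, I would first fill the missing values with zeros, i.e. n.fillna(0). Then compute the percentage and assign the result to a new column. For the female percentage: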
n['%F'] = n[('Count', 'F')] / n['sum'] * 100
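To make the two steps concrete, here is a minimal sketch. The frame below is a hypothetical stand-in for the asker's n (the counts are made up for illustration), assuming two-level columns with a Count level per gender and a sum column. It addresses the columns by their full tuple keys.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame `n`: two-level columns
# holding per-gender counts and a precomputed row total.
columns = pd.MultiIndex.from_tuples(
    [("Count", "F"), ("Count", "M"), ("sum", "")],
    names=[None, "Gender"],
)
n = pd.DataFrame(
    [[np.nan, 10.0, 10.0], [7.0, np.nan, 7.0], [5.0, 15.0, 20.0]],
    index=pd.Index(["Aaban", "Aabrielle", "Aaden"], name="Name"),
    columns=columns,
)

# Step 1: a missing count means "no babies of that gender", so use 0.
n = n.fillna(0)

# Step 2: divide each gender's count by the row total.
n["%F"] = n[("Count", "F")] / n[("sum", "")] * 100
n["%M"] = n[("Count", "M")] / n[("sum", "")] * 100

print(n)
```

Note that fillna returns a new frame, so the result has to be assigned back to n (or used directly) for the zeros to take effect.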
Answer 1: (score: 0)
You can do this even before you compute the sum:
n.apply(lambda x: x / x.sum(), axis=1)
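As a rough illustration of that one-liner, assume a frame that holds only the F and M counts with missing values already filled with 0 (the name counts and the data are hypothetical). The result is a fraction between 0 and 1, so multiply by 100 if you want percentages.

```python
import pandas as pd

# Hypothetical per-gender counts, before any 'sum' column is added.
counts = pd.DataFrame(
    {"F": [0.0, 7.0, 5.0], "M": [10.0, 0.0, 15.0]},
    index=pd.Index(["Aaban", "Aabrielle", "Aaden"], name="Name"),
)

# Row-wise normalisation: each row is divided by its own total.
shares = counts.apply(lambda x: x / x.sum(), axis=1)

# Vectorised form of the same computation, without a Python call per row.
shares_fast = counts.div(counts.sum(axis=1), axis=0)

print(shares * 100)
```

On a large names table the div version is likely to be noticeably faster, since apply with axis=1 calls the lambda once per row.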
Answer 2: (score: 0)
It looks like the columns are a MultiIndex:
print(n.columns)
MultiIndex(levels=[[u'Count', u'sum'], [u'', u'F', u'M']],
labels=[[0, 0, 1], [1, 2, 0]],
names=[None, u'Gender'])
First select columns F and M using slicers.
Then fill NaN with 0 and divide by the sum column:
idx = pd.IndexSlice
F = n.loc[:, idx['Count','F']]
M = n.loc[:, idx['Count','M']]
sum = n.loc[:, idx['sum','']]
n['%F'] = F.fillna(0)/sum * 100
n['%M'] = M.fillna(0)/sum * 100
print(n)
                 Count                     sum          %F          %M
Gender               F           M
Name
Aaban              NaN   10.285710   10.285710    0.000000  100.000000
Aabfla        7.000000         NaN    7.000000  100.000000    0.000000
Aabid              NaN    5.000000    5.000000    0.000000  100.000000
Aabrielle     5.000000         NaN    5.000000  100.000000    0.000000
Aadarn             NaN    8.521739    8.521739    0.000000  100.000000
Aadan              NaN   12.000000   12.000000    0.000000  100.000000
Aadar              NaN   11.285710   11.285710    0.000000  100.000000
Aaden         5.000000  279.002857  284.002857    1.760546   98.239454
Aade               NaN    5.000000    5.000000    0.000000  100.000000
Aadhav             NaN   12.750000   12.750000    0.000000  100.000000
Aadhavan           NaN    6.333333    6.333333    0.000000  100.000000
Aadhi              NaN    6.000000    6.000000    0.000000  100.000000
Aadhira       0.888857         NaN    9.000007    9.876181    0.000000
Aadhve       79.875000         NaN   79.875000  100.000000    0.000000
Aadhven            NaN    5.000000    5.000000    0.000000  100.000000
Aadi          5.333333   55.583333   60.910007    8.756087   91.254846
Aadian             NaN    5.000000    5.000000    0.000000  100.000000
Aadil              NaN   12.913003   12.913003    0.000000  100.000000
Aadin              NaN   12.000000   12.000000    0.000000  100.000000
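To tie this back to the original goal (pass in a list of names, get back the gender split), here is one possible end-to-end sketch. It assumes the raw data is in long format with columns called Name, Gender, Year and Count; those column names, the helper gender_percentages and the sample rows are all hypothetical.

```python
import pandas as pd

# Hypothetical long-format sample; column names are assumptions.
df = pd.DataFrame({
    "Name": ["Aaden", "Aaden", "Aadi", "Aadi", "Aaban"],
    "Gender": ["F", "M", "F", "M", "M"],
    "Year": [2008, 2008, 2009, 2010, 2011],
    "Count": [5, 15, 10, 30, 10],
})

def gender_percentages(frame, names):
    """Return the F/M percentage split for each requested name, ignoring the year."""
    counts = (
        frame.groupby(["Name", "Gender"])["Count"].sum()
        .unstack("Gender", fill_value=0)  # one column per gender, 0 where absent
    )
    pct = counts.div(counts.sum(axis=1), axis=0) * 100
    # reindex keeps the requested order and yields NaN rows for unknown names.
    return pct.reindex(names)

result = gender_percentages(df, ["Aaden", "Aaban", "Zed"])
print(result)
```

Using fill_value=0 in unstack avoids the separate fillna step, and reindex makes names that are missing from the dataset show up as NaN instead of raising a KeyError.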
Example, calculating gcd (13, 8):
13 = 8 * 1 + 5
8 = 5 * 1 + 3
5 = 3 * 1 + 2
3 = 2 * 1 + 1
2 = 1 * 2 + 0
gcd (13, 8) = 1
5 steps are required
There are only 2 different quotients: 1 and 2
It is not a very interesting example.
Instead, calculating gcd (455, 355):
455 = 355 * 1 + 100
355 = 100 * 3 + 55
100 = 55 * 1 + 45
55 = 45 * 1 + 10
45 = 10 * 4 + 5
10 = 5 * 2 + 0
gcd (455, 355) = 5
6 steps (or lines or divisions) are required
4 different quotients: 1, 3, 4, 2
So, this case is more interesting than the last.
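The two traces above can be reproduced with a short sketch of the Euclidean algorithm that also counts the divisions and collects the distinct quotients. The function name euclid_trace is made up for this illustration.

```python
def euclid_trace(a, b):
    """Return (gcd, number of divisions, distinct quotients in order of first appearance)."""
    steps = 0
    quotients = []
    while b != 0:
        q, r = divmod(a, b)  # a = b * q + r
        steps += 1
        if q not in quotients:
            quotients.append(q)
        a, b = b, r
    return a, steps, quotients

print(euclid_trace(13, 8))     # (1, 5, [1, 2])
print(euclid_trace(455, 355))  # (5, 6, [1, 3, 4, 2])
```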