I'm new to Python and would like advice on a faster way to iterate over the items of a DataFrame. My implementation:
weights = histCaps.astype(float)  # float copy that will hold the ratios
for index, row in histCaps.iterrows():
    for column, item in row.items():  # Series.iteritems() was removed in pandas 2.0
        weights.loc[index, column] = item / row.sum()
Answer 0 (score: 2)
Don't loop; it is better to use the vectorized div together with sum for better performance:
import pandas as pd

histCaps = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
})
# Divide every row by its own row total
weights = histCaps.div(histCaps.sum(axis=1), axis=0)
print(weights)
          B         C         D         E
0  0.235294  0.411765  0.058824  0.294118
1  0.263158  0.421053  0.157895  0.157895
2  0.166667  0.375000  0.208333  0.250000
3  0.200000  0.160000  0.280000  0.360000
4  0.500000  0.200000  0.100000  0.200000
5  0.363636  0.272727  0.000000  0.363636
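As a quick sanity check, here is a minimal sketch that compares the vectorized result with a row-by-row loop on the same sample data. The helper names loop_weights and vectorized_weights are made up for this illustration and are not part of pandas; the loop is only similar in spirit to the one in the question.

```python
import pandas as pd

histCaps = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
})

def loop_weights(df):
    # Hypothetical helper: row-by-row version, one cell at a time
    out = df.astype(float)
    for index, row in df.iterrows():
        total = row.sum()
        for column, item in row.items():
            out.loc[index, column] = item / total
    return out

def vectorized_weights(df):
    # Hypothetical helper: the div/sum approach from the answer
    return df.div(df.sum(axis=1), axis=0)

loop_result = loop_weights(histCaps)
vec_result = vectorized_weights(histCaps)

# Raises AssertionError if the two results differ
pd.testing.assert_frame_equal(loop_result, vec_result)
```

Both versions should give the same frame; the vectorized one avoids the Python-level loop, which is where the speedup is expected to come from on larger data.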
Details:
print (histCaps.sum(axis=1))
0 17
1 19
2 24
3 25
4 10
5 11
dtype: int64
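To make the role of axis=0 more concrete, here is a small sketch, assuming the same sample data. With axis=0, div aligns the Series of row totals on the row index, so each row is divided by its own total. The NumPy variant is only an equivalent illustration (the names values and weights_np are introduced here), not something the answer requires.

```python
import numpy as np
import pandas as pd

histCaps = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
})

row_sums = histCaps.sum(axis=1)

# axis=0 matches row_sums against the row index (0..5)
weights = histCaps.div(row_sums, axis=0)

# Same computation with NumPy broadcasting: shape (6, 4) divided by shape (6, 1)
values = histCaps.to_numpy()
weights_np = values / values.sum(axis=1, keepdims=True)

# Each row of weights should sum to 1, and both variants should agree
print(np.allclose(weights.sum(axis=1), 1.0))
print(np.allclose(weights.to_numpy(), weights_np))
```

Note that a row whose total is 0 would produce NaN or inf here; the sample data has no such row.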