Row-wise calculation (Python)

Time: 2018-02-05 12:37:38

Tags: python pandas

I am trying to run the following code to create a new column 'Median Rank':

N=data2.Rank.count()
for i in data2.Rank:
    data2['Median_Rank']=i-0.3/(N+0.4)

But I get a constant value of 0.99802 in every row, even though my Rank column looks like this:

data2.Rank.head()
Out[464]: 
4131     1.0
4173     3.0
4172     3.0
4132     3.0
5335    10.0
4171    10.0
4159    10.0
5079    10.0
4115    10.0
4179    10.0
4180    10.0
4147    10.0
4181    10.0
4175    10.0
4170    10.0
4116    24.0
4129    24.0
4156    24.0
4153    24.0
4160    24.0
5358    24.0
4152    24.0

Could someone please point out the error in my code?

2 answers:

Answer 0: (score: 1)

Your code is not vectorized. Use this instead:

N = data2.Rank.count()
data2['Median_Rank'] = data2['Rank'] - 0.3 / (N+0.4)

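Your code does not work because every loop iteration assigns to the entire column. As a result, only the value from the last iteration of i survives, and all values in data2['Median_Rank'] are guaranteed to be the same.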
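To make the difference concrete, here is a minimal sketch that runs both versions side by side. The small DataFrame is made-up sample data standing in for data2, and the column name Loop_Result is hypothetical, introduced only so the two results can be compared.

```python
import pandas as pd

# Hypothetical sample data standing in for data2 (values are made up).
data2 = pd.DataFrame(
    {"Rank": [1.0, 3.0, 3.0, 10.0]},
    index=[4131, 4173, 4172, 5335],
)
N = data2.Rank.count()

# Loop from the question: each pass assigns one scalar to the whole column,
# so after the loop every row holds the value from the last iteration.
for i in data2.Rank:
    data2["Loop_Result"] = i - 0.3 / (N + 0.4)

# Vectorized form: the subtraction is applied to each row of the Series.
data2["Median_Rank"] = data2["Rank"] - 0.3 / (N + 0.4)

# True when the loop column contains a single repeated value.
loop_is_constant = data2["Loop_Result"].nunique() == 1
print(data2)
```

Printing data2 should show Loop_Result repeating one number in every row, while Median_Rank varies with Rank.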

Answer 1: (score: 1)

This occurs because every time you execute data2['Median_Rank']=i-0.3/(N+0.4), you overwrite the entire column with the single value computed by the expression. The easiest way to do this does not need a loop at all:

N=data2.Rank.count()
data2['Median_Rank'] = data2.Rank-0.3/(N+0.4)

This works because pandas supports element-wise operations on Series.

If you still want to use a for loop, you will need to use .at and iterate over the rows as follows:

# Pair each index label with its Rank value and write one cell per iteration.
for i, el in zip(data2.index, data2.Rank.values):
    data2.at[i,'Median_Rank']=el-0.3/(N+0.4)
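As a rough check, the sketch below runs the .at loop on made-up sample data and compares it with the vectorized expression. It assumes the index labels are unique, since .at addresses a single cell by label. The column names Median_Rank_loop and Median_Rank_vec are hypothetical and exist only for the comparison.

```python
import pandas as pd

# Hypothetical sample data standing in for data2 (values are made up).
# Assumption: index labels are unique, so .at addresses exactly one cell.
data2 = pd.DataFrame(
    {"Rank": [1.0, 3.0, 10.0, 24.0]},
    index=[4131, 4173, 5335, 4116],
)
N = data2.Rank.count()

# Create the target column up front so each .at call only fills one cell.
data2["Median_Rank_loop"] = float("nan")

# Row-by-row version: one scalar assignment per index label.
for i, el in zip(data2.index, data2.Rank.values):
    data2.at[i, "Median_Rank_loop"] = el - 0.3 / (N + 0.4)

# Vectorized version for comparison.
data2["Median_Rank_vec"] = data2.Rank - 0.3 / (N + 0.4)

# True when both approaches agree on every row (within float tolerance).
same = (data2["Median_Rank_loop"] - data2["Median_Rank_vec"]).abs().max() < 1e-12
print(same)
```

Both approaches should give the same values; the vectorized form is usually the shorter and faster choice.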