Question

user

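How can I loop over this DataFrame, add col1 + col2 + col3 + col4 at each index, and, if the sum is not 100, take each value at that index, compute col1 / (col1 + col2 + col3 + col4), and make that the new value, so that when you sum col1 + col2 + col3 + col4 that index totals 100?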

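For example, at index 0, col1 + col2 + col3 + col4 equals 100, so move on to the next index. At index 1, however, the sum is 99, so take 20/99 and make that the new value for that position, and so on.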

Expected output:

import pandas

d = {'col1': [25,20,30],
     'col2': [25,20,30],
     'col3': [25,20,30], 
     'col4': [25,39,11]
     }

df = pandas.DataFrame(data=d)
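As a quick check of the sums described in the question, here is a minimal sketch that rebuilds the sample frame and totals each row. The name row_sums is only illustrative and does not appear in the original post.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    'col1': [25, 20, 30],
    'col2': [25, 20, 30],
    'col3': [25, 20, 30],
    'col4': [25, 39, 11],
})

# Sum across the columns of each row (axis=1)
row_sums = df.sum(axis=1)
print(row_sums.tolist())  # [100, 99, 101]
```

Index 0 already totals 100, while indexes 1 and 2 are the ones that would need adjusting.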

Answer 1

Here is a vectorized version:

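import numpy as np
import pandas as pd

# True for rows whose sum is not 100
c = df.sum(axis=1).ne(100)
# Divide flagged rows by their row sum; leave the other rows unchanged
vals = np.where(c.to_numpy()[:, None], df.div(df.sum(axis=1), axis=0), df)
new_df = pd.DataFrame(vals, index=df.index, columns=df.columns)
# to overwrite the original df, use: df[:] = vals
print(new_df)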

       col1      col2      col3       col4
0  25.00000  25.00000  25.00000  25.000000
1   0.20202   0.20202   0.20202   0.393939
2   0.29703   0.29703   0.29703   0.108911
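The output above leaves the adjusted rows summing to 1 rather than 100. If the intent is for every row to total 100, as the wording of the question suggests, one possible variation is to multiply the shares by 100. This is only a sketch under that assumption; the names needs_fix and rescaled_df are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [25, 20, 30],
    'col2': [25, 20, 30],
    'col3': [25, 20, 30],
    'col4': [25, 39, 11],
})

row_sums = df.sum(axis=1)
needs_fix = row_sums.ne(100)  # rows that do not already total 100

# Work on a float copy so the fractional values fit the columns
rescaled_df = df.astype(float)
# Assumption: "total 100" means each value becomes its percentage of the row sum
rescaled_df.loc[needs_fix] = (
    rescaled_df.loc[needs_fix].div(row_sums[needs_fix], axis=0) * 100
)
print(rescaled_df.sum(axis=1))
```

Rows that already total 100 are left untouched, and the remaining rows are rescaled so that their sums come out at 100 up to floating-point rounding.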

Answer 2

This does what you want by first gathering the values at position i from every list in the dict d into a list of their own:

col = [d[row][i] for row in d]

Then apply the procedure you described:

if sum(col) != 100:
    newcol = [n/sum(col) for n in col]

The resulting list can then be written back. The final result:

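for i in range(len(d['col1'])):
    # values at position i from every list in d
    col = [d[row][i] for row in d]
    total = sum(col)
    if total != 100:
        newcol = [n / total for n in col]
    else:
        newcol = col.copy()
    # write the values back in the same key order they were read
    for row, value in zip(d, newcol):
        d[row][i] = value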
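The loop above modifies d in place. As a rough sketch of the same idea that leaves the input alone, the logic could be wrapped in a helper that returns a new dict. The function name normalize_positions and its target parameter are hypothetical and are not part of the original answer.

```python
def normalize_positions(data, target=100):
    """Return a copy of data where each position whose values do not
    sum to target is replaced by value / sum (hypothetical helper)."""
    keys = list(data)
    out = {k: list(data[k]) for k in keys}  # copy so the input is untouched
    for i in range(len(data[keys[0]])):
        values = [data[k][i] for k in keys]
        total = sum(values)
        if total != target:
            for k, v in zip(keys, values):
                out[k][i] = v / total
    return out


d = {'col1': [25, 20, 30],
     'col2': [25, 20, 30],
     'col3': [25, 20, 30],
     'col4': [25, 39, 11]}

result = normalize_positions(d)
print(result['col1'])
```

Position 0 keeps its original values because it already sums to 100, and d itself is unchanged after the call.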

Answer 3

I ended up solving my problem with this approach:

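# cast to float first so the divided values fit the columns
df = df.astype(float)
for i in range(len(df)):
    # row total; note that every row is divided, including rows that already sum to 100
    x = df.loc[i, 'col1'] + df.loc[i, 'col2'] + df.loc[i, 'col3'] + df.loc[i, 'col4']
    for j in range(0, 4):
        df.iloc[i, j] = df.iloc[i, j] / x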
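The loop above divides every row, so a row that already totals 100 ends up as 0.25 per column. If such rows should be left alone, as the question describes, a guarded version of the same row loop might look like the sketch below. The cols list and the float cast are assumptions added for illustration.

```python
import pandas as pd

# Cast to float up front so fractional results can be stored in place
df = pd.DataFrame({
    'col1': [25, 20, 30],
    'col2': [25, 20, 30],
    'col3': [25, 20, 30],
    'col4': [25, 39, 11],
}).astype(float)

cols = ['col1', 'col2', 'col3', 'col4']
for i in df.index:
    total = df.loc[i, cols].sum()
    # Only touch rows whose sum differs from 100
    if total != 100:
        df.loc[i, cols] = df.loc[i, cols] / total

print(df)
```

For frames with many rows, a vectorized approach is likely to be faster than this explicit loop.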

Sum columns and replace individual values if a certain condition is met
