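I have a DataFrame df with three columns: A, B, and C. I want to create a weighted-average column, but I want to test different sets of weights (the weights must sum to 100%).
So I could do this: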
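import numpy as np
weights = np.arange(0, 1, 0.05)
for i in weights:
    for j in weights:
        for k in weights:
            # keep only combinations whose weights sum to 1 (np.isclose tolerates float error)
            if np.isclose(i + j + k, 1):
                # i goes with A, j with B, k with C, in both the name and the formula
                outname = str(i) + 'A' + str(j) + 'B' + str(k) + 'C'
                df[outname] = df['A'].multiply(i) + df['B'].multiply(j) + df['C'].multiply(k)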
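However, the number of columns may grow, and then this approach stops working, because every extra column needs another nested loop.
Can anyone see a clever way to do this?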
Answer 0 (score: 1)
This is what you are looking for:
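from random import randint
import numpy as np
import pandas as pd
df = pd.DataFrame([[0,1,2],[3,4,5],[6,7,8]], columns=['A','B','C'])
cols = list(df.columns)  # snapshot of the original columns, since df grows below
weightpool = np.arange(0, 1, 0.05)
for times in range(1, 3):
    # start from zeros on every pass so a fresh set of weights is drawn
    weights = np.zeros(len(cols))
    # redraw until all weights sum up to 1 (np.isclose tolerates float error)
    while not np.isclose(weights.sum(), 1):
        # choose one weight per column out of the pool
        for i in range(len(weights)):
            weights[i] = weightpool[randint(0, len(weightpool)-1)]
    outname = ''
    outvalue = pd.Series(0.0, index=df.index)
    # build the column name and accumulate the weighted sum over all columns
    for i in range(len(weights)):
        outname = outname + str(round(weights[i], 2)) + cols[i]
        outvalue = outvalue + df[cols[i]].multiply(weights[i])
    df[outname] = outvalue
df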
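If you would rather enumerate every weight combination than sample at random, here is a minimal sketch that works for any number of columns. It is only one possible approach, not something from the original code: the helper names weight_grid and add_weighted_columns and the steps parameter are made up for illustration. It counts in integer units so the "sums to 1" check is exact, and it also allows a weight of exactly 1.0, which np.arange(0, 1, 0.05) leaves out.

```python
from itertools import product

import numpy as np
import pandas as pd


def weight_grid(n_cols, steps=20):
    """Yield every tuple of n_cols weights on a 1/steps grid that sums to 1.

    Hypothetical helper: works in integer units (0..steps) so the sum
    check is exact, then converts to fractions at the end.
    """
    for head in product(range(steps + 1), repeat=n_cols - 1):
        last = steps - sum(head)  # the final weight is whatever is left over
        if last >= 0:
            yield tuple(u / steps for u in head + (last,))


def add_weighted_columns(df, steps=20):
    """Return a copy of df with one weighted-average column per weight tuple."""
    cols = list(df.columns)
    combos = np.array(list(weight_grid(len(cols), steps)))
    # (rows x cols) @ (cols x combos) computes all weighted averages at once
    values = df[cols].to_numpy(dtype=float) @ combos.T
    # two-decimal names are unique for steps=20 (0.05 grid); other grids may need more digits
    names = ["".join(f"{w:.2f}{c}" for w, c in zip(ws, cols)) for ws in combos]
    weighted = pd.DataFrame(values, index=df.index, columns=names)
    return pd.concat([df, weighted], axis=1)


df = pd.DataFrame([[0, 1, 2], [3, 4, 5], [6, 7, 8]], columns=["A", "B", "C"])
result = add_weighted_columns(df, steps=4)  # weights in quarters, to keep the demo small
```

Keep in mind that the number of combinations grows quickly with the number of columns, so for wide frames a coarser grid or the random sampling above is probably more practical.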