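I have to print percentages, but the catch is that the values must be rounded to 4 decimal places. They are stored in a DataFrame in which each column holds the percentages of one allocation.
Sometimes the percentages in a column do not sum to 1 but to 0.9999 or 1.0001 (which makes sense after rounding). But how do you fix it? You have to pick a row arbitrarily and put the delta into it. I came up with this solution, but it has to iterate over every column and modify each Series.
Code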
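As a small illustration of why a rounded column can miss 1, here is a sketch in plain Python. The three equal shares are an assumed example, not values from the DataFrame in this question.

```python
# Hypothetical allocation: three equal shares of 1/3 each.
shares = [1 / 3, 1 / 3, 1 / 3]

# Rounding each share to 4 decimals drops a little from every entry.
rounded = [round(s, 4) for s in shares]

# The rounded shares no longer add up to 1.
total = round(sum(rounded), 4)

# This leftover is the delta that has to be placed in one of the rows.
delta = round(1 - total, 4)

print(rounded, total, delta)
```

Rounding away from 1/3 loses 0.0000333... per entry, so three entries together lose one unit in the fourth decimal place.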
import numpy as np
import pandas as pd

df = abs(pd.DataFrame(np.random.randn(4, 4), columns=range(0, 4)))
# Make sure each column of allocations sums to 1.
df = df / df.sum()
# Round the allocations.
df = df.round(4)
print("-- before --")
print(df)
print(df.sum())
# After rounding, a column may no longer sum to 1 (imagine rounding 1/3 three times...),
# so check the sum of each column and put the delta in the fund with the lowest value.
for p in df:
    if df[p].sum() != 1:
        # Get the row label of the fund with the lowest percentage (but not 0).
        low_id = df[p][df[p] != 0].idxmin()
        # Use .loc rather than chained indexing so the write reaches df itself.
        df.loc[low_id, p] += 1 - df[p].sum()
print("-- after --")
print(df)
print(df.sum())
Output
-- before --
        0       1       2       3
0  0.0116  0.1256  0.4980  0.3738
1  0.2562  0.5458  0.3086  0.1221
2  0.4853  0.0009  0.0588  0.0078
3  0.2470  0.3277  0.1346  0.4962
0    1.0001
1    1.0000
2    1.0000
3    0.9999
dtype: float64
-- after --
        0       1       2       3
0  0.0115  0.1256  0.4980  0.3738
1  0.2562  0.5458  0.3086  0.1221
2  0.4853  0.0009  0.0588  0.0079
3  0.2470  0.3277  0.1346  0.4962
0    1.0
1    1.0
2    1.0
3    1.0
dtype: float64
Is there a faster solution?
Thanks a lot,
Best regards, Julien
Answer 0 (score: 0)
It is always better to avoid loops.
import numpy as np
import pandas as pd

df = abs(pd.DataFrame(np.random.randn(4, 4)))
df = df / df.sum()
df = df.round(4)
columns = ['Sum', 'Min', 'submin']
dftemp = pd.DataFrame(columns=columns)
dftemp['Sum'] = df.sum(axis=0)  # sum of each column
dftemp['Min'] = df[df != 0].min(axis=0)  # non-zero minimum of each column
dftemp['submin'] = dftemp['Min'] + (1 - dftemp['Sum'])  # minimum plus the delta (1 - column sum)
# Decide whether to keep the existing minimum or use the delta-adjusted one
dftemp['FinalValue'] = np.where(dftemp['Sum'] != 1, dftemp.submin, dftemp.Min)
print('\n\nBefore \n\n ', df, '\n\n ', df.sum())
# Replace each column's non-zero minimum with its final value
is_min = df.eq(dftemp['Min'], axis=1)
df = df.mask(is_min, is_min.mul(dftemp['FinalValue'].tolist()))
print('After \n\n ', df, '\n\n ', df.sum())
Output
Before
        0       1       2       3
0  0.1686  0.0029  0.1055  0.1739
1  0.5721  0.5576  0.2904  0.2205
2  0.0715  0.2749  0.4404  0.5014
3  0.1878  0.1647  0.1637  0.1042
0    1.0000
1    1.0001
2    1.0000
3    1.0000
dtype: float64
After
        0       1       2       3
0  0.1686  0.0028  0.1055  0.1739
1  0.5721  0.5576  0.2904  0.2205
2  0.0715  0.2749  0.4404  0.5014
3  0.1878  0.1647  0.1637  0.1042
0    1.0
1    1.0
2    1.0
3    1.0
dtype: float64
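For comparison, the same idea (put each column's delta into its smallest non-zero entry) could also be written with `idxmin` and one NumPy indexed assignment. This is only a sketch, not the code from the answer above: the function name `fix_rounded_columns` and its `decimals` parameter are hypothetical, and it assumes a numeric DataFrame with unique row labels whose columns each sum to roughly 1 before rounding.

```python
import numpy as np
import pandas as pd


def fix_rounded_columns(df, decimals=4):
    """Round df and move each column's rounding delta into its smallest non-zero entry.

    Hypothetical helper for illustration; assumes unique row labels.
    """
    out = df.round(decimals)
    # Amount still missing (or in excess) per column after rounding.
    delta = (1 - out.sum(axis=0)).round(decimals)
    # Row label of the smallest non-zero value in each column (zeros become NaN and are skipped).
    low_rows = out[out != 0].idxmin(axis=0)
    # Convert row labels to integer positions, then adjust one cell per column at once.
    row_pos = out.index.get_indexer(low_rows.to_numpy())
    col_pos = np.arange(out.shape[1])
    values = out.to_numpy(copy=True)
    values[row_pos, col_pos] += delta.to_numpy()
    # Round again to clear floating-point residue left by the addition.
    return pd.DataFrame(values, index=out.index, columns=out.columns).round(decimals)


# Usage on random allocations, mirroring the setup in the question.
rng = np.random.default_rng(0)
raw = pd.DataFrame(np.abs(rng.standard_normal((4, 4))))
alloc = fix_rounded_columns(raw / raw.sum())
print(alloc)
print(alloc.sum())
```

Rounding the delta to the same number of decimals avoids the exact `!= 1` float comparison: columns that already sum to 1 get a delta of 0 and are left unchanged. When several entries tie for the minimum, only the first one is adjusted.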