Python newbie here! I have a simple DataFrame for tabulating quiz scores:
import pandas as pd

df = pd.DataFrame({'Sam': [20, 20, 20, 20, 20],
                   'Jim': [20, 20, 20, 20, 15],
                   'Stacy': [20, 20, 20, 20, 30],
                   'Leslie': [20, 20, 20, 20, 20],
                   'Jonathan': [20, 20, 20, 20, 15]})
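Now I want to write something that changes the top value of each column (row 0) until that column's mean equals a predetermined value, and then moves on to the next column. It is easy to add a new row and do this by hand until I get the result I want (as shown below). However, I'm looking for something that lets the program iterate over the iloc position to reach the predetermined 'mean2' value. I imagine this needs some kind of while loop, but I can't figure out the syntax. A screenshot of the final desired result is below the code. Thanks!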
df.loc['mean1'] = df.mean()
df.iloc[0:1,0:5] = 17, 17, 22, 22, 22
df.loc['mean2'] = df.iloc[:5,:].mean()
df
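Answer 0 (score: 0)
Correct me if I'm wrong, but let me rephrase your question:
For each participant (column), you want to find the value of the first quiz score (df.loc[0]) that gives that participant the target mean score (mean2)?
If that is the case, you can do this: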
# Estimate the quiz score a participant needs in order to reach a target mean
def estimate_replace(quiz_id, mean_target, participant_series):
    # Work on a copy so the caller's column is not modified in place
    participant_series = participant_series.copy()
    # Scores from every quiz except the one being replaced
    data = participant_series.loc[participant_series.index != quiz_id].values
    participant_series['mean1'] = participant_series.mean()
    participant_series['mean2'] = mean_target
    # Key step: solve mean_target = (x + data.sum()) / (len(data) + 1) for x
    participant_series.loc[quiz_id] = mean_target*(len(data)+1) - data.sum()
    return participant_series
# mean2: target mean score per participant
mean_score_target = {'Jim': 18.4,
                     'Jonathan': 18.4,
                     'Leslie': 20.4,
                     'Sam': 20.4,
                     'Stacy': 22.4}
# the quiz id to replace, 0 in your case
quiz_id = 0
df = df.apply(lambda x: estimate_replace(quiz_id, mean_score_target[x.name], x))
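Note that this code works for any number of quiz scores (rows), and you can choose which quiz score to estimate/replace via quiz_id.
You will then get the following output: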
        Jim  Jonathan  Leslie   Sam  Stacy
0      17.0      17.0    22.0  22.0   22.0
1      20.0      20.0    20.0  20.0   20.0
2      20.0      20.0    20.0  20.0   20.0
3      20.0      20.0    20.0  20.0   20.0
4      15.0      15.0    20.0  20.0   30.0
mean1  19.0      19.0    20.0  20.0   22.0
mean2  18.4      18.4    20.4  20.4   22.4
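Since the question asked about a while loop, it may be worth noting that no iteration is needed: the replacement value has a closed form, because target = (x + sum of the other scores) / n can be solved directly for x. Below is a minimal sketch of the same idea applied to all columns at once, without apply. The names scores, targets, others_sum and adjusted are only illustrative and not part of the code above, and the frame is converted to float first on the assumption that fractional replacement values should be allowed.

```python
import pandas as pd

scores = pd.DataFrame({
    'Sam': [20, 20, 20, 20, 20],
    'Jim': [20, 20, 20, 20, 15],
    'Stacy': [20, 20, 20, 20, 30],
    'Leslie': [20, 20, 20, 20, 20],
    'Jonathan': [20, 20, 20, 20, 15],
})

# Target mean per participant (same values as mean_score_target above).
targets = pd.Series({'Jim': 18.4, 'Jonathan': 18.4, 'Leslie': 20.4,
                     'Sam': 20.4, 'Stacy': 22.4})

quiz_id = 0  # row label of the quiz score to overwrite

# Per-column sum of every score except the one being replaced.
others_sum = scores.drop(index=quiz_id).sum()

# Solve target = (x + others_sum) / n for x; pandas aligns on column names.
adjusted = scores.astype(float)
adjusted.loc[quiz_id] = targets * len(scores) - others_sum

# Each column mean should now match its target, up to float rounding.
new_means = adjusted.mean()
print(adjusted)
print(new_means)
```

Because the arithmetic is aligned by label, the order of the columns in scores and the order of the keys in targets do not have to match.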