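I have a DataFrame with the columns "Category", "Factor1", "Factor2", "Factor3", "Factor4", "UseFactorA", and "UseFactorB".
The values of "UseFactorA" and "UseFactorB" are each one of the strings ['Factor1','Factor2','Factor3','Factor4'], keyed on the value in "Category".
I want to generate a "Result" column equal to dataframe[UseFactorA] / dataframe[UseFactorB], evaluated row by row.
Take the following DataFrame as an example: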
Category  Factor1  Factor2  Factor3  Factor4  UseFactorA  UseFactorB
A         1        2        5        8        'Factor1'   'Factor3'
B         2        7        4        2        'Factor3'   'Factor1'
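The "Result" series should be [0.2, 2] (1/5 for row A, 4/2 for row B).
However, I can't figure out how to feed the values of UseFactorA and UseFactorB into the indexer to achieve this. If the columns to use were fixed, I would simply write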
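For anyone who wants to reproduce this, the example frame can be built roughly as follows. This is a minimal sketch: it assumes the last two columns are named UseFactorA and UseFactorB, as in the column list at the top, and that the factor values are plain integers.

```python
import pandas as pd

# Example frame from the question. Column names follow the list at the top
# of the post (assumption: the selector columns are UseFactorA / UseFactorB).
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

print(df)
```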
df['Result'] = df['Factor1']/df['Factor2']
However, when I try
df['Results'] = df[df['UseFactorA']]/df[df['UseFactorB']]
I get the error
ValueError: Wrong number of items passed 3842, placement implies 1
Is there a way to do what I am trying to do here?
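Answer 0: (score: 1)
It may not be the prettiest solution (because it iterates), but what comes to mind is to loop over the rows of the two selector columns and set the "Result" value at each index: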
for i, factors in df[['UseFactorA', 'UseFactorB']].iterrows():
    # use this row's own numerator and denominator, not the whole columns
    df.loc[i, 'Result'] = df.loc[i, factors['UseFactorA']] / df.loc[i, factors['UseFactorB']]
Edit:
Another option:
def factor_calc_for_row(row):
    factorA = row['UseFactorA']
    factorB = row['UseFactorB']
    return row[factorA] / row[factorB]
df['Result'] = df.apply(factor_calc_for_row, axis=1)
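If the row-by-row iteration ever becomes a bottleneck, one possible way to avoid it is NumPy integer-array indexing. This is a different technique from the two options above, shown only as a sketch. It assumes every name in UseFactorA and UseFactorB is one of the four factor columns (get_indexer returns -1 for an unknown name, which would silently pick the last column). The names factor_cols, values, rows, col_a and col_b are illustrative.

```python
import numpy as np
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

factor_cols = ["Factor1", "Factor2", "Factor3", "Factor4"]
values = df[factor_cols].to_numpy()   # 2-D array, shape (n_rows, 4)
rows = np.arange(len(df))             # row positions 0..n_rows-1

# Translate each row's column name into a position within factor_cols.
col_a = pd.Index(factor_cols).get_indexer(df["UseFactorA"])
col_b = pd.Index(factor_cols).get_indexer(df["UseFactorB"])

# Pairing rows with col_a / col_b picks exactly one value per row.
df["Result"] = values[rows, col_a] / values[rows, col_b]

print(df[["Category", "Result"]])
```

On the two-row example this gives 0.2 for A and 2.0 for B, the same values the apply version produces.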
Answer 1: (score: 1)
Here is a one-liner:
df['Results'] = [df[df['UseFactorA'][x]][x]/df[df['UseFactorB'][x]][x] for x in range(len(df))]
How it works:
df[df['UseFactorA']]
returns a DataFrame (one column for every row of UseFactorA),
df[df['UseFactorA'][x]]
returns a Series, and
df[df['UseFactorA'][x]][x]
extracts a single value from that Series.
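The three steps can be checked on the example frame from the question. This is only a sketch: it assumes the default 0..n-1 index, so that the position x is also a valid row label, and step1, step2 and step3 are illustrative names.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

x = 0  # row position; also the row label here because the index is 0..n-1

step1 = df[df["UseFactorA"]]        # Series of names -> DataFrame, one column per row
step2 = df[df["UseFactorA"][x]]     # single column name -> Series
step3 = df[df["UseFactorA"][x]][x]  # one element of that Series

print(type(step1).__name__, step1.shape)
print(type(step2).__name__, step2.shape)
print(step3)
```

The first step yields as many columns as the frame has rows, which is consistent with the "Wrong number of items passed 3842" error in the question, where a whole frame was being assigned to a single column.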