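I have a dataframe that looks like the one below. I created 3 new columns that should take their values from other columns. I want to spread the Function column out into separate columns and get the total hours worked on each function for each user.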
User   Function           Total hours  Damage Processing  problem solve  damages  sweeper
schae  Damage Processing  9.36
Julie  Problem solve      9.70
John   sweeper            18.9
Dan    Damages            1.83
Dan    Damages            1.83
Julie  Damages            1.83
Dan    Problem solve      1.83
The expected output looks like this:
User   Function           Total hours  Damage Processing  problem solve  damages  sweeper
schae  Damage Processing  9.36         9.36
Julie  Problem solve      9.70                            9.70
John   sweeper            18.9                                                    18.9
Dan    Damages            1.83                                           1.83
Dan    sweeper            1.83                                                    1.83
Julie  Damages            1.83                                           1.83
Dan    Problem solve      1.83                            1.83
I thought of pd.melt, but it raises an error saying the value var does not exist:
res = pd.melt(result,id_vars = ['Function'],value_vars=['Total hours'])
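For context, pd.melt reshapes from wide to long, which is the opposite direction from the expected output above. The sketch below uses a small made-up wide frame (the `wide` data is hypothetical, not taken from the question) to show what melt produces. It also assumes the error in the question was a KeyError, which melt raises when a name in value_vars is not an exact column label.

```python
import pandas as pd

# Hypothetical toy data, only to illustrate the direction of melt.
wide = pd.DataFrame({
    "User": ["Dan", "Julie"],
    "Damages": [1.83, 1.83],
    "sweeper": [0.5, 2.0],
})

# melt turns the per-function columns back into (Function, Total hours) rows.
long = pd.melt(
    wide,
    id_vars=["User"],
    value_vars=["Damages", "sweeper"],
    var_name="Function",
    value_name="Total hours",
)
print(long)

# A name in value_vars that is not a column label raises KeyError.
try:
    pd.melt(wide, id_vars=["User"], value_vars=["Total hours"])
except KeyError as exc:
    print("KeyError:", exc)
```

If the real frame raises the same error, comparing the names passed to melt against `result.columns.tolist()` may reveal stray whitespace or a different spelling.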
Answer 0: (score: 1)
Here is an approach using get_dummies and df.assign:
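import numpy as np
import pandas as pd
# One indicator column per Function, scaled by each row's hours; zeros become NaN
out = (df[['User', 'Function', 'Total hours']]
       .assign(**pd.get_dummies(df['Function']).mul(df['Total hours'], axis=0).replace(0, np.nan)))
print(out)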
User Function Total hours Damage Processing Damages \
0 schae Damage Processing 9.36 9.36 NaN
1 Julie Problem solve 9.70 NaN NaN
2 John sweeper 18.90 NaN NaN
3 Dan Damages 1.83 NaN 1.83
4 Dan Damages 1.83 NaN 1.83
5 Julie Damages 1.83 NaN 1.83
6 Dan Problem solve 1.83 NaN NaN
Problem solve sweeper
0 NaN NaN
1 9.70 NaN
2 NaN 18.9
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 1.83 NaN
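A self-contained sketch of the same idea, rebuilding the sample data from the question so it can be run directly. Passing `dtype=float` to get_dummies and using `pd.concat` instead of `assign` are choices made here for clarity, not part of the answer above. The final groupby is one possible way to get the per-user totals the question asks for.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "User": ["schae", "Julie", "John", "Dan", "Dan", "Julie", "Dan"],
    "Function": ["Damage Processing", "Problem solve", "sweeper",
                 "Damages", "Damages", "Damages", "Problem solve"],
    "Total hours": [9.36, 9.70, 18.9, 1.83, 1.83, 1.83, 1.83],
})

# 1.0 where the row belongs to that function, 0.0 elsewhere.
dummies = pd.get_dummies(df["Function"], dtype=float)

# Scale each row by its hours, then blank out the zeros.
wide = dummies.mul(df["Total hours"], axis=0).replace(0, np.nan)
out = pd.concat([df, wide], axis=1)
print(out)

# Total hours per user and function (NaN is skipped when summing).
totals = out.drop(columns=["Function", "Total hours"]).groupby("User").sum()
print(totals)
```

Note that `replace(0, np.nan)` would also blank a row whose Total hours is genuinely 0, so this sketch assumes every recorded value is non-zero.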
Answer 1: (score: 0)
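# Assumes df has the default 0..n-1 integer index
for i in range(len(df)):
    col = df.at[i, 'Function']                  # target column for this row
    df.at[i, col] = df.at[i, 'Total hours']     # copy the hours into it
    print(col)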
Try this! The variable col looks up the column into which the Total hours value should be inserted.
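As a possible loop-free alternative (not part of either answer), the same row-by-row placement can be sketched with DataFrame.pivot. When no index is passed, pivot keeps the existing row index, so this assumes that index is unique. The sample data is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "User": ["schae", "Julie", "John", "Dan", "Dan", "Julie", "Dan"],
    "Function": ["Damage Processing", "Problem solve", "sweeper",
                 "Damages", "Damages", "Damages", "Problem solve"],
    "Total hours": [9.36, 9.70, 18.9, 1.83, 1.83, 1.83, 1.83],
})

# One column per Function value; each row keeps its hours only in its own column.
wide = df.pivot(columns="Function", values="Total hours")

# Attach the new columns to the original rows by index.
out = df.join(wide)
print(out)
```

Unlike the loop, this leaves the original df unchanged and produces NaN in every column that does not match the row's Function.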