I have a df.apply that returns a Series of key/value pairs like the one below (let's call it column A):
Col. A
40237 1.871111
40239 1.280556
40240 1.784167
40241 0.049167
40243 0.011389
40244 0.660278
40245 1.512500
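I want to save these values into separate columns, chosen by the value in another column (column B).
Column B can contain 6 different strings (4 in this example).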
Col. B
40237 Open
40239 Open
40240 In Progress
40241 Closed
40243 Waiting
40244 In Progress
40245 Waiting
I want to save each value of column A into one of 4 columns, depending on the value of column B.
The end result would be:
Col. A Col. B Open Time In Progress Time Closed Time Waiting Time
40237 1.871111 Open 1.871111 np.nan np.nan np.nan
40239 1.280556 Open 1.280556 np.nan np.nan np.nan
40240 1.784167 In Progress np.nan 1.784167 np.nan np.nan
40241 0.049167 Closed np.nan np.nan 0.049167 np.nan
40243 0.011389 Waiting np.nan np.nan np.nan 0.011389
40244 0.660278 In Progress np.nan 0.660278 np.nan np.nan
40245 1.512500 Waiting np.nan np.nan np.nan 1.512500
Right now, my best attempt at this is:
for key in output.index:
    df.loc[key,(df['Col. B'] + " Time")] = output.loc[key]
But I get the error ValueError: cannot index with vector containing NA / NaN values. I'm not sure why this happens, although my columns do generally contain a lot of NaN values.
Answer 0 (score: 3)
Use join with a pivoted version of the DataFrame, together with add_suffix:
df = df.join(pd.pivot(df.index, df['Col. B'], df['Col. A']).add_suffix(' Time'))
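The one-liner above passes three array-likes to the top-level pd.pivot, which matches older pandas releases. In more recent versions pd.pivot expects a DataFrame as its first argument, so that call may not run as written. A minimal sketch of the same idea using the DataFrame.pivot method, assuming the row labels are unique and rebuilding the sample data from the question (the names wide and result are only for illustration):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Col. A": [1.871111, 1.280556, 1.784167, 0.049167,
                   0.011389, 0.660278, 1.512500],
        "Col. B": ["Open", "Open", "In Progress", "Closed",
                   "Waiting", "In Progress", "Waiting"],
    },
    index=[40237, 40239, 40240, 40241, 40243, 40244, 40245],
)

# With no index argument, pivot keeps the existing row labels;
# each distinct value of Col. B becomes its own column.
wide = df.pivot(columns="Col. B", values="Col. A").add_suffix(" Time")

# Attach the new columns to the original frame, aligned on the index
result = df.join(wide)
print(result)
```

Each row ends up with its Col. A value in exactly one of the new columns and NaN in the others.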
Another solution is to use set_index together with unstack:
df = df.join(df.set_index('Col. B', append=True)['Col. A'].unstack().add_suffix(' Time'))
print (df)
Col. A Col. B Closed Time In Progress Time Open Time \
40237 1.871111 Open NaN NaN 1.871111
40239 1.280556 Open NaN NaN 1.280556
40240 1.784167 In Progress NaN 1.784167 NaN
40241 0.049167 Closed 0.049167 NaN NaN
40243 0.011389 Waiting NaN NaN NaN
40244 0.660278 In Progress NaN 0.660278 NaN
40245 1.512500 Waiting NaN NaN NaN
Waiting Time
40237 NaN
40239 NaN
40240 NaN
40241 NaN
40243 0.011389
40244 NaN
40245 1.512500
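Note that the new columns in the output above come out in alphabetical order, not in the order shown in the question. If the order matters, one option is to reindex the unstacked columns before joining. This is only a sketch on the question's sample data; the list wanted is an assumption about the desired order, and any status listed there but absent from Col. B would simply become an all-NaN column:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Col. A": [1.871111, 1.280556, 1.784167, 0.049167,
                   0.011389, 0.660278, 1.512500],
        "Col. B": ["Open", "Open", "In Progress", "Closed",
                   "Waiting", "In Progress", "Waiting"],
    },
    index=[40237, 40239, 40240, 40241, 40243, 40244, 40245],
)

# Add Col. B as a second index level, then spread that level into columns
wide = df.set_index("Col. B", append=True)["Col. A"].unstack().add_suffix(" Time")

# Assumed column order, taken from the desired result in the question
wanted = ["Open Time", "In Progress Time", "Closed Time", "Waiting Time"]

# reindex puts the columns in the requested order before the join
result = df.join(wide.reindex(columns=wanted))
print(result)
```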