I have a data structure that looks like this:
idtenifier  amount  dist_type   new_value  new_value2
1           1.0     normal
1           2.0     new_value
1           1.0     new_value2
3           1.0     normal
5           3.0     normal
5           23.0    new_value2
2           1.0     normal
I am looking for a structure like this:
idtenifier  amount  dist_type  new_value  new_value2
1           1.0     normal     2.0        1.0
3           1.0     normal
5           3.0     normal                23.0
2           1.0     normal
I feel that the way I am trying to do this is very inefficient, and I cannot even get the values assigned to the column:
import numpy as np

df['new_value'] = np.nan
for idx, row in df.iterrows():
    identifier = row['identifier']
    dist_type = row['dist_type']
    amount = row['amount']
    if idx > 0 and identifier == df.loc[idx-1, 'identifier']:
        print(dist_type)
        if dist_type == 'new_value':
            # Note: '==' only compares and discards the result, so nothing
            # is stored; an assignment would need a single '='.
            df.loc[idx-1, 'new_value'] == amount
Answer 0 (score: 1)
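We do not need a for loop here. Split the dataframe in two: the rows where dist_type is normal, and the rows where it is not. Pivot the second part so that each dist_type becomes a column, then merge it back onto the first part.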
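# Rows with dist_type 'normal' form the base table; the rest are pivoted into columns.
df1 = df.loc[df.dist_type == 'normal'].copy()
df2 = df.loc[df.dist_type != 'normal'].copy()
# DataFrame.pivot takes keyword arguments; recent pandas versions reject positional ones.
yourdf = df1.merge(df2.pivot(index='idtenifier', columns='dist_type', values='amount').reset_index(), how='left')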
yourdf
Out[33]:
   idtenifier  amount dist_type  new_value  new_value2
0           1     1.0    normal        2.0         1.0
1           3     1.0    normal        NaN         NaN
2           5     3.0    normal        NaN        23.0
3           2     1.0    normal        NaN         NaN
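
For anyone who wants to try the split, pivot, and merge approach end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question's data with only the three populated columns, which is an assumption about how the real df looks. The helper name `widen` is hypothetical and used only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: the real df has only
# these three columns; the column name is spelled as in the source).
df = pd.DataFrame({
    "idtenifier": [1, 1, 1, 3, 5, 5, 2],
    "amount": [1.0, 2.0, 1.0, 1.0, 3.0, 23.0, 1.0],
    "dist_type": ["normal", "new_value", "new_value2",
                  "normal", "normal", "new_value2", "normal"],
})


def widen(frame):
    """Hypothetical helper: turn non-'normal' rows into extra columns."""
    normal = frame.loc[frame["dist_type"] == "normal"]
    extra = frame.loc[frame["dist_type"] != "normal"]
    # One column per remaining dist_type, one row per idtenifier.
    wide = extra.pivot(index="idtenifier", columns="dist_type",
                       values="amount").reset_index()
    wide.columns.name = None  # drop the leftover 'dist_type' axis label
    # A left merge keeps every 'normal' row, filling NaN where nothing matched.
    return normal.merge(wide, on="idtenifier", how="left")


result = widen(df)
print(result)
```

One caveat: `pivot` raises a ValueError if the same idtenifier has two rows with the same dist_type. If that can happen in the real data, `pivot_table` with an explicit `aggfunc` is a possible alternative, at the cost of having to decide how duplicates are combined.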