I have the following pandas DataFrames:
import pandas as pd
import numpy as np
df = pd.DataFrame({'par': ['par_scl', 'par_scl', 'par_scl', 'par_loc','par_pot'],
'var': ['all', 'all', 'all', 'all','var1'],
'terc': [1, 2, 3, np.nan,np.nan],
'value': [1,2,3,4,5]})
par var terc value
0 par_scl all 1.0 1
1 par_scl all 2.0 2
2 par_scl all 3.0 3
3 par_loc all NaN 4
4 par_pot var1 NaN 5
vars_df = pd.DataFrame({'var': ['var1', 'var2']})
var
0 var1
1 var2
I would like to end up with:
final_df = pd.DataFrame({'par':['par_scl','par_scl','par_scl','par_scl','par_scl','par_scl','par_loc','par_loc','par_pot'],
'var':['var1','var2','var1','var2','var1','var2','var1','var2','var1'],
'terc':[1,1,2,2,3,3,np.nan,np.nan,np.nan],
'value':[1,1,2,2,3,3,4,4,5]})
par var terc value
0 par_scl var1 1.0 1
1 par_scl var2 1.0 1
2 par_scl var1 2.0 2
3 par_scl var2 2.0 2
4 par_scl var1 3.0 3
5 par_scl var2 3.0 3
6 par_loc var1 NaN 4
7 par_loc var2 NaN 4
8 par_pot var1 NaN 5
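In other words: if var is all, then I want one row per var in vars_df for each combination of terc and par; otherwise, just an outer merge.
Is there an efficient way to do this?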
Answer 0 (score: 2)
One solution is to build a cross join from the filtered rows, drop the helper columns, and then append the rows whose var is not all:
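# Cross join via a constant helper key 'a'; the left 'var' column is renamed to 'var_'
df1 = df[df['var'] == 'all'].assign(a=1).merge(vars_df.assign(a=1), on='a', suffixes=('_',''))
# DataFrame.append was removed in pandas 2.0, so pd.concat is used to stack the rows
df2 = pd.concat([df1.drop(['var_','a'], axis=1), df[df['var'] != 'all']],
                sort=False, ignore_index=True)
print(df2)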
par terc value var
0 par_scl 1.0 1 var1
1 par_scl 1.0 1 var2
2 par_scl 2.0 2 var1
3 par_scl 2.0 2 var2
4 par_scl 3.0 3 var1
5 par_scl 3.0 3 var2
6 par_loc NaN 4 var1
7 par_loc NaN 4 var2
8 par_pot NaN 5 var1
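
As a possible alternative, the same cross join can be sketched without the helper column by using merge with how='cross', assuming pandas 1.2 or later. The names is_all, expanded and result are illustrative only and do not come from the original answer; the final column reordering is added here just to match the layout of final_df.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'par': ['par_scl', 'par_scl', 'par_scl', 'par_loc', 'par_pot'],
                   'var': ['all', 'all', 'all', 'all', 'var1'],
                   'terc': [1, 2, 3, np.nan, np.nan],
                   'value': [1, 2, 3, 4, 5]})
vars_df = pd.DataFrame({'var': ['var1', 'var2']})

# Boolean mask marking the rows that must be expanded (hypothetical name)
is_all = df['var'] == 'all'

# Cross join: each 'all' row is paired with every row of vars_df.
# Dropping 'var' first avoids a column-name clash in the merge.
expanded = df[is_all].drop(columns='var').merge(vars_df, how='cross')

# Stack the expanded rows on top of the rows that already name a variable
result = pd.concat([expanded, df[~is_all]], ignore_index=True)

# Restore the column order used in the question
result = result[['par', 'var', 'terc', 'value']]
print(result)
```

The mask is computed once and reused for both halves, so the filter condition is only written in one place.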