I have two DataFrames read from Excel.
df1
SRD CIVF Test Case
0 9530\n3678\n549 CIV-016
1 9979\n9980 CIV-040
2 5231\n4455 CIV-177
df2
SRD SRD CR
0 549\n9980 CR181
1 4455 CR170
2 5231\n9979 CR190
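For df1, I want to add a third column showing which 'SRD CR' entries reference the same SRD numbers listed in its SRD column. I'm not sure whether I should use pandas' 'map' or 'add' function. The DataFrame shown as df3 is basically what I'm looking for. df3 will be written back to Excel. I also expect that keeping multiple values in the same cell (for Excel) will be tricky.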
df3
SRD CIVF Test Case SRD CR
0 9530\n3678\n549 CIV-016 CR181
1 9979\n9980 CIV-040 CR190\nCR181
2 5231\n4455 CIV-177 CR170\nCR190
Answer 0 (score: 0)
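Unless there is a good reason to keep the SRD values buried in single-line strings, I would transform df1 and df2 so that each row holds exactly one SRD value. Then you can merge on the SRD values: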
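import pandas as pd

# Split each SRD string into one column per value. The pattern matches a literal
# backslash followed by 'n'; for real newline characters, split on '\n' instead.
split1 = df1['SRD'].str.split('\\\\n', expand=True)
split2 = df2['SRD'].str.split('\\\\n', expand=True)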
split1
0 1 2
0 9530 3678 549
1 9979 9980 None
2 5231 4455 None
split2
0 1
0 549 9980
1 4455 None
2 5231 9979
# Concatenate the above split columns onto the right sides of
# the original DFs
catted1 = pd.concat([df1, split1], axis=1)
catted2 = pd.concat([df2, split2], axis=1)
catted1
SRD CIVF Test Case 0 1 2
0 9530\n3678\n549 NaN CIV-016 9530 3678 549
1 9979\n9980 NaN CIV-040 9979 9980 None
2 5231\n4455 NaN CIV-177 5231 4455 None
catted2
SRD SRD CR 0 1
0 549\n9980 CR181 549 9980
1 4455 CR170 4455 None
2 5231\n9979 CR190 5231 9979
# Give each SRD its own row
melted1 = (pd.melt(catted1,
                   id_vars=['CIVF', 'Test Case', 'SRD'],
                   value_name='Shared_SRDs')
             .drop('variable', axis=1)
             .dropna(subset=['Shared_SRDs']))
melted2 = (pd.melt(catted2.drop('SRD', axis=1),
                   id_vars=['SRD CR'],
                   value_name='Shared_SRDs')
             .drop('variable', axis=1)
             .dropna())
melted1
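CIVF Test Case SRD Shared_SRDs
0 NaN CIV-016 9530\n3678\n549 9530
1 NaN CIV-040 9979\n9980 9979
2 NaN CIV-177 5231\n4455 5231
3 NaN CIV-016 9530\n3678\n549 3678
4 NaN CIV-040 9979\n9980 9980
5 NaN CIV-177 5231\n4455 4455
6 NaN CIV-016 9530\n3678\n549 549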
melted2
SRD CR Shared_SRDs
0 CR181 549
1 CR170 4455
2 CR190 5231
3 CR181 9980
5 CR190 9979
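As an alternative to the split / concat / melt steps above, a shorter route to one SRD per row is Series.str.split followed by DataFrame.explode (available in pandas 0.25 and later). This is a minimal sketch rather than the approach used in this answer: it assumes the cells hold real newline characters, rebuilds small sample frames by hand, leaves out the empty CIVF column, and the names long1 and long2 are made up for illustration.

```python
import pandas as pd

# Sample data mirroring df1 and df2 (assumption: real newline characters,
# and the empty CIVF column is omitted for brevity).
df1 = pd.DataFrame({
    'SRD': ['9530\n3678\n549', '9979\n9980', '5231\n4455'],
    'Test Case': ['CIV-016', 'CIV-040', 'CIV-177'],
})
df2 = pd.DataFrame({
    'SRD': ['549\n9980', '4455', '5231\n9979'],
    'SRD CR': ['CR181', 'CR170', 'CR190'],
})

# Split each cell into a list, then give every list element its own row.
# long1 and long2 are hypothetical names for the long-format frames.
long1 = (df1.assign(Shared_SRDs=df1['SRD'].str.split('\n'))
            .explode('Shared_SRDs'))
long2 = (df2.assign(Shared_SRDs=df2['SRD'].str.split('\n'))
            .explode('Shared_SRDs')
            .drop(columns='SRD'))
```

Here long1 and long2 play the same role as melted1 and melted2, so they can be merged on the Shared_SRDs column in the same way.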
# Merge on the individual SRD values, which live in the Shared_SRDs column
merged = melted1.merge(melted2, on='Shared_SRDs').sort_values(['Test Case', 'SRD CR']).reset_index(drop=True)
merged
CIVF Test Case SRD Shared_SRDs SRD CR
0 NaN CIV-016 9530\n3678\n549 549 CR181
1 NaN CIV-040 9979\n9980 9980 CR181
2 NaN CIV-040 9979\n9980 9979 CR190
3 NaN CIV-177 5231\n4455 4455 CR170
4 NaN CIV-177 5231\n4455 5231 CR190
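To get from merged back to the one-row-per-test-case layout of df3 in the question, the CR labels can be grouped and joined with newlines. The sketch below is one possible way to finish the job, not part of the steps above: it rebuilds merged by hand (without the empty CIVF column), assumes real newline characters in the SRD strings, and uses a hypothetical output file name in the commented-out export line.

```python
import pandas as pd

# Hand-built copy of the merged frame shown above (CIVF omitted;
# real newline characters are an assumption).
merged = pd.DataFrame({
    'Test Case': ['CIV-016', 'CIV-040', 'CIV-040', 'CIV-177', 'CIV-177'],
    'SRD': ['9530\n3678\n549', '9979\n9980', '9979\n9980',
            '5231\n4455', '5231\n4455'],
    'Shared_SRDs': ['549', '9980', '9979', '4455', '5231'],
    'SRD CR': ['CR181', 'CR181', 'CR190', 'CR170', 'CR190'],
})

# Collapse to one row per test case, joining the CR labels with newlines.
# sort=False keeps the groups in order of first appearance.
df3 = (merged.groupby(['SRD', 'Test Case'], sort=False)['SRD CR']
             .agg('\n'.join)
             .reset_index())

# Writing back to Excel needs an installed writer engine such as openpyxl;
# 'df3.xlsx' is a hypothetical file name.
# df3.to_excel('df3.xlsx', index=False)
```

The order of the CR labels inside each cell follows the row order of merged, so it may differ from the order shown in the question's df3. In Excel, the joined values should appear on separate lines within a cell once text wrapping is enabled for that column.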