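I need to loop through the columns of df2 to find the values contained in df1['Part No']. I then need to add a new column to df1 holding the header of the df2 column in which each value was found.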
import pandas as pd

data1 = {"Part No": ['100', '101', '102'],
"Desc": ["Blue", "Green", "Red"]}
df1 = pd.DataFrame(data1)
df1 = df1[['Part No', 'Desc']]
data2 = {"col1": ['100', '101', 'a', 'b'],
"col2": ['102', 'c', 'd', 'e' ],
"col3": ['999', '1', '2', '0' ]}
df2 = pd.DataFrame(data2)
print(df1)
print('\r')
print(df2)
print('\r')
#My expected output:
data3 = {"Part No": ['100', '101', '102'],
"Desc": ["Blue", "Green", "Red"],
"New Col" : ['col1', 'col1', 'col2']}
df3 = pd.DataFrame(data3)
df3 = df3[['Part No', 'Desc', 'New Col']]
print(df3)
Answer 0 (score: 1)
By using unstack, reset_index and rename on df2, you can get, in one line, the values of df2 alongside the name of the column each value sits in:
(df2.unstack().reset_index(name='Part No')[['level_0','Part No']]
.rename(columns={'level_0':'New Col'}))
# if you print this, it looks like:
   New Col Part No
0     col1     100
1     col1     101
2     col1       a
3     col1       b
4     col2     102
5     col2       c
6     col2       d
7     col2       e
8     col3     999
9     col3       1
10    col3       2
11    col3       0
Then merge df1 with the result of the operation above on df2, for example:
df3 = df1.merge((df2.unstack()
.reset_index(name='Part No')[['level_0','Part No']]
.rename(columns={'level_0':'New Col'}) ) ,how='left')
This gives you df3:
  Part No   Desc New Col
0     100   Blue    col1
1     101  Green    col1
2     102    Red    col2
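For checking the merge approach end to end, here is a minimal self-contained sketch. The extra part '103' (with Desc "Black") is an assumption added for illustration and is not in the question's data; it shows that how='left' keeps rows of df1 that have no match in df2 and leaves New Col missing for them.

```python
import pandas as pd

# Data from the question, plus one hypothetical part ('103') that is absent from df2.
df1 = pd.DataFrame({"Part No": ['100', '101', '102', '103'],
                    "Desc": ["Blue", "Green", "Red", "Black"]})
df2 = pd.DataFrame({"col1": ['100', '101', 'a', 'b'],
                    "col2": ['102', 'c', 'd', 'e'],
                    "col3": ['999', '1', '2', '0']})

# Long-format lookup table: one row per df2 cell, with the column name it came from.
lookup = (df2.unstack()
             .reset_index(name='Part No')[['level_0', 'Part No']]
             .rename(columns={'level_0': 'New Col'}))

# Left merge on the shared 'Part No' column; unmatched parts get a missing New Col.
df3 = df1.merge(lookup, how='left')
print(df3)
```

If unmatched parts should be dropped instead, how='inner' would do that.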
EDIT: @DSM proposed another solution, using melt instead of unstack, reset_index and rename to reshape df2 and get the same result:
df2.melt(value_name="Part No", var_name="New Col")
Then
df3 = df1.merge(df2.melt(value_name="Part No", var_name="New Col") ,how='left')
gives the expected output.
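One thing to watch with either merge variant: if a part number appears in more than one column of df2, the merge returns one row per match. The sketch below is one possible way to keep only the first match. It assumes "first" means the leftmost df2 column, and it modifies df2 so that '100' also appears in col3, which is not the case in the question's data.

```python
import pandas as pd

df1 = pd.DataFrame({"Part No": ['100', '101', '102'],
                    "Desc": ["Blue", "Green", "Red"]})
# Hypothetical variation: '100' appears in both col1 and col3.
df2 = pd.DataFrame({"col1": ['100', '101', 'a', 'b'],
                    "col2": ['102', 'c', 'd', 'e'],
                    "col3": ['100', '1', '2', '0']})

# melt stacks the columns in order (col1, then col2, then col3).
long_df2 = df2.melt(value_name="Part No", var_name="New Col")

# Without de-duplication, '100' matches twice and df1 grows by one row.
with_duplicates = df1.merge(long_df2, how='left')

# Keep only the first occurrence of each value before merging.
first_match = long_df2.drop_duplicates(subset="Part No", keep="first")
df3 = df1.merge(first_match, how='left')
print(df3)
```

Whether the first match is the right one to keep depends on the data; keep="last" or a groupby that joins all matching column names are alternatives.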
Answer 1 (score: 0)
Using pythonic code and the all-powerful numpy:
import numpy as np
df1['new col'] = df1['Part No'].apply(lambda x: df2.columns[list(zip(*np.where(df2==x)))[0][1]] )
The output is:
  Part No   Desc new col
0     100   Blue    col1
1     101  Green    col1
2     102    Red    col2
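Note that the one-liner above raises an IndexError when a part number does not occur anywhere in df2, because the list of matches is empty. A minimal sketch of a guarded variant follows. The helper name find_column and the extra part '103' are hypothetical, introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"Part No": ['100', '101', '102', '103'],
                    "Desc": ["Blue", "Green", "Red", "Black"]})
df2 = pd.DataFrame({"col1": ['100', '101', 'a', 'b'],
                    "col2": ['102', 'c', 'd', 'e'],
                    "col3": ['999', '1', '2', '0']})

def find_column(value, frame):
    """Return the label of a column of `frame` containing `value`, or None."""
    # np.where on a 2-D boolean array returns (row_indices, column_indices).
    rows, cols = np.where((frame == value).to_numpy())
    # Use the first hit if there is one; otherwise signal "not found".
    return frame.columns[cols[0]] if cols.size else None

df1['new col'] = [find_column(v, df2) for v in df1['Part No']]
print(df1)
```

Like the original, this scans all of df2 once per part number, so the merge-based answers will likely scale better on large frames.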