How do I create a third column containing the header of the column where a value was found?

Time: 2018-07-20 15:56:31

Tags: python pandas

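I need to search the columns of df2 for the values contained in df1['Part No']. For each value, I need to add a new column to df1 holding the header of the df2 column in which that value was found.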

import pandas as pd

data1 = {"Part No": ['100', '101', '102'],
         "Desc": ["Blue", "Green", "Red"]}

df1 = pd.DataFrame(data1)
df1 = df1[['Part No', 'Desc']]

data2 = {"col1": ['100', '101', 'a', 'b'], 
        "col2": ['102', 'c', 'd', 'e' ], 
        "col3": ['999', '1', '2', '0' ]}

df2 = pd.DataFrame(data2)

print(df1)
print()  # blank line between the two frames
print(df2)
print()

#My expected output:
data3 = {"Part No": ['100', '101', '102'],
        "Desc": ["Blue", "Green", "Red"], 
         "New Col" : ['col1', 'col1', 'col2']}

df3 = pd.DataFrame(data3)
df3 = df3[['Part No', 'Desc', 'New Col']]
print(df3)

2 Answers:

Answer 0 (score: 1)

By using unstack, reset_index and rename on df2, you can get each value of df2 and the name of the column it comes from on the same row:

(df2.unstack().reset_index(name='Part No')[['level_0','Part No']]
       .rename(columns={'level_0':'New Col'}))
# if you print this, it looks like:
   New Col Part No
0     col1     100
1     col1     101
2     col1       a
3     col1       b
4     col2     102
5     col2       c
6     col2       d
7     col2       e
8     col3     999
9     col3       1
10    col3       2
11    col3       0

Then merge df1 with the reshaped df2 from above, for example:

df3 = df1.merge((df2.unstack()
                    .reset_index(name='Part No')[['level_0','Part No']]
                    .rename(columns={'level_0':'New Col'}) ) ,how='left')

and df3 is then:

  Part No   Desc New Col
0     100   Blue    col1
1     101  Green    col1
2     102    Red    col2
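A minimal, self-contained sketch of the steps above, to show what how='left' does when a part is not present anywhere in df2. The extra row '103' / 'Black' is a hypothetical addition for illustration and is not part of the question's data.

```python
import pandas as pd

# '103' / 'Black' is a hypothetical extra part that does not occur in df2
df1 = pd.DataFrame({"Part No": ["100", "101", "102", "103"],
                    "Desc": ["Blue", "Green", "Red", "Black"]})
df2 = pd.DataFrame({"col1": ["100", "101", "a", "b"],
                    "col2": ["102", "c", "d", "e"],
                    "col3": ["999", "1", "2", "0"]})

# One row per cell of df2: the column header next to the value it holds
lookup = (df2.unstack()
             .reset_index(name="Part No")[["level_0", "Part No"]]
             .rename(columns={"level_0": "New Col"}))

# The merge key is the only shared column, 'Part No'.
# how='left' keeps every row of df1; parts with no match get NaN in 'New Col'.
df3 = df1.merge(lookup, how="left")
print(df3)
```

With how='inner' the unmatched part would be dropped instead, so the left join seems the safer default if df1 must keep all of its rows.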

EDIT: @DSM suggested another solution, using melt instead of unstack, reset_index and rename to reshape df2 and get the same result:

df2.melt(value_name="Part No", var_name="New Col")

and then

df3 = df1.merge(df2.melt(value_name="Part No", var_name="New Col") ,how='left')

gives the expected output.
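One thing to watch for with the merge: if a value appears in more than one column of df2, the left join returns one row per match, so df3 ends up longer than df1. A possible sketch of a workaround, assuming that keeping only the first matching column is acceptable. The frame df2_dup is a hypothetical variant of df2 in which '100' also appears in col3.

```python
import pandas as pd

df1 = pd.DataFrame({"Part No": ["100", "101", "102"],
                    "Desc": ["Blue", "Green", "Red"]})

# Hypothetical variant of df2: '100' occurs in both col1 and col3
df2_dup = pd.DataFrame({"col1": ["100", "101", "a", "b"],
                        "col2": ["102", "c", "d", "e"],
                        "col3": ["100", "1", "2", "0"]})

# Long format: one row per cell, columns 'New Col' (header) and 'Part No' (value)
long_df = df2_dup.melt(value_name="Part No", var_name="New Col")

# Without deduplication, '100' matches twice and the result has 4 rows
with_duplicates = df1.merge(long_df, how="left")

# melt stacks the columns in order, so keep='first' keeps the leftmost column
first_match = long_df.drop_duplicates(subset="Part No", keep="first")
df3 = df1.merge(first_match, how="left")
print(df3)
```

If every matching column is needed rather than just the first, grouping long_df by 'Part No' and joining the headers would be another option.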

Answer 1 (score: 0)

Using Pythonic code and the all-powerful numpy:

import numpy as np    

df1['new col'] = df1['Part No'].apply(lambda x: df2.columns[list(zip(*np.where(df2==x)))[0][1]] )

The output is:

  Part No   Desc new col
0     100   Blue    col1
1     101  Green    col1
2     102    Red    col2
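Note that, as far as I can tell, the apply/np.where one-liner raises an IndexError when a part number is not found anywhere in df2, because indexing [0] on an empty list of matches fails. A hedged alternative sketch that builds a plain dictionary and uses Series.map, which yields NaN for missing values instead. The name value_to_col and the extra part '103' / 'Black' are hypothetical and only there for illustration.

```python
import pandas as pd

# '103' / 'Black' is a hypothetical part that does not occur in df2
df1 = pd.DataFrame({"Part No": ["100", "101", "102", "103"],
                    "Desc": ["Blue", "Green", "Red", "Black"]})
df2 = pd.DataFrame({"col1": ["100", "101", "a", "b"],
                    "col2": ["102", "c", "d", "e"],
                    "col3": ["999", "1", "2", "0"]})

# Map each value in df2 to the header of the first column that contains it
value_to_col = {}
for col in df2.columns:
    for value in df2[col]:
        value_to_col.setdefault(value, col)  # keep the earliest column only

# Series.map with a dict leaves NaN where the key is missing
df1["new col"] = df1["Part No"].map(value_to_col)
print(df1)
```

This also scans df2 once instead of once per row of df1, which may matter for larger frames.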