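Create column values based on the column names of two separate DataFrames in pandas (Python)

Time: 2017-03-07 20:18:30

Tags: python pandas dataframe

I have two pandas DataFrames, and I would like to create the Result column (shown in yellow) by matching each Column1 value in the first df with the corresponding value in the second df. The Column1 values in the first df are references to column names in the second df.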


2 answers:

Answer 0 (score: 1)

You can use melt and set_index on df2:

import pandas as pd

df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'], 'col1': ['APB', 'UPB', 'APB', 'UPB']})
df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'], 'APB': [117, 112.8, 112.37, 112.23], 'UPB': [67.9, 67.8, 66.7, 66.9]})

# Reshape df2 from wide to long: one row per (Date, column name) pair
df2 = pd.melt(df2, id_vars='Date', value_vars=['APB', 'UPB'])
df2['Date'] = pd.to_datetime(df2['Date'])
# Show the long table sorted and indexed by date (df2 itself is not modified here)
print(df2.sort_values(by='Date').set_index('Date'))

This gives you (first five rows shown):

           variable   value
Date
2016-12-30      APB  117.00
2016-12-30      UPB   67.90
2017-01-31      APB  112.80
2017-01-31      UPB   67.80
2017-02-28      APB  112.37

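Now you can merge the two DataFrames on the column name. Only col1 is taken from df1, so that its Date column does not clash with the Date column of df2:

df1 = df1[['col1']].merge(df2, left_on = 'col1', right_on = 'variable').drop_duplicates().drop('variable', axis = 1).sort_values(by = 'Date')

That will give you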

   col1       Date   value
0   APB 2016-12-30  117.00
8   UPB 2016-12-30   67.90
1   APB 2017-01-31  112.80
9   UPB 2017-01-31   67.80
2   APB 2017-02-28  112.37
10  UPB 2017-02-28   66.70
3   APB 2017-03-31  112.23
11  UPB 2017-03-31   66.90
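
The merge above matches on the column name alone, so it returns a value for every date in df2. If the goal is one value per row of df1, one possible variation is to join on the date as well. The following is only a minimal sketch under that assumption; the names long_df2 and result are illustrative and do not come from the answer.

```python
import pandas as pd

df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'],
                    'col1': ['APB', 'UPB', 'APB', 'UPB']})
df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'],
                    'APB': [117, 112.8, 112.37, 112.23],
                    'UPB': [67.9, 67.8, 66.7, 66.9]})

# Parse the dates on both sides so the join keys share one dtype.
df1['Date'] = pd.to_datetime(df1['Date'])
df2['Date'] = pd.to_datetime(df2['Date'])

# Wide to long: one row per (Date, column name) pair.
# var_name is set to 'col1' so both frames share the same key names.
long_df2 = pd.melt(df2, id_vars='Date', var_name='col1', value_name='value')

# A left join keeps exactly the rows of df1; a pair missing from df2
# would show up as NaN in 'value' instead of being dropped.
result = df1.merge(long_df2, on=['Date', 'col1'], how='left')
print(result)
```

With the sample data this should leave four rows, one per row of df1, rather than the eight rows produced by matching on the column name only.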

Answer 1 (score: 1)

As mentioned in the comments, melt followed by merge is a good approach:

import pandas as pd

df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'], 
                    'Column1': ['APB', 'UPB', 'APB', 'UPB']})

df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'], 
                   'APB': [117, 112.8, 112.37, 112.23], 
                   'UPB': [67.925, 67.865, 66.717, 66.939]})

melted = pd.melt(df2, id_vars="Date", var_name="Column1", value_name="Result")
merged = df1.merge(melted, on=["Date", "Column1"])

print(merged)

  Column1        Date   Result
0     APB  12/30/2016  117.000
1     UPB  12/30/2016   67.925
2     APB   1/31/2017  112.800
3     UPB   1/31/2017   67.865
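
For comparison, the same Result column can probably also be built without reshaping df2, by translating each (Date, Column1) pair into a row and column position and reading the values straight from the underlying array. This is a hedged sketch, not part of either answer. It assumes every Date appears only once in df2 and that every pair in df1 exists in df2; the names wide, rows and cols are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Date': ['12/30/2016', '12/30/2016', '1/31/2017', '1/31/2017'],
                    'Column1': ['APB', 'UPB', 'APB', 'UPB']})
df2 = pd.DataFrame({'Date': ['12/30/2016', '1/31/2017', '2/28/2017', '3/31/2017'],
                    'APB': [117, 112.8, 112.37, 112.23],
                    'UPB': [67.925, 67.865, 66.717, 66.939]})

# Index df2 by date so that rows and columns can both be looked up by label.
wide = df2.set_index('Date')

# Positions of each requested date and column name; -1 marks "not found".
rows = wide.index.get_indexer(df1['Date'])
cols = wide.columns.get_indexer(df1['Column1'])

# Fail loudly on a missing key: -1 would otherwise silently pick the
# last row or column when used as an array position.
if (rows < 0).any() or (cols < 0).any():
    raise KeyError('some (Date, Column1) pairs in df1 are missing from df2')

# Pick one value per row of df1 using paired row/column positions.
df1['Result'] = wide.to_numpy()[rows, cols]
print(df1)
```

The melt and merge version is usually easier to read; the positional lookup mainly helps when df2 is large and reshaping it would be wasteful.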