Question

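I have a dataframe like the one below, and I need to create a new column, year_val, based on the Years column and the columns col2016 through col2019, such that when Years equals the suffix of a col#### column, year_val takes the value of that col#### column.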

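   Years  col2016  col2017  col2018  col2019
0   2016        1        9       17       25
1   2016        2       10       18       26
2   2017        3       11       19       27
3   2017        4       12       20       28
4   2018        5       13       21       29
5   2018        6       14       22       30
6   2019        7       15       23       31
7   2019        8       16       24       32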

Answer 1

Use DataFrame.lookup, building the column labels by casting the values of the Years column to strings and prefixing them with col:

sampleDF['year_val'] = sampleDF.lookup(sampleDF.index, 'col' + sampleDF['Years'].astype(str))

print (sampleDF)
   Years  col2016  col2017  col2018  col2019  year_val
0   2016        1        9       17       25         1
1   2016        2       10       18       26         2
2   2017        3       11       19       27        11
3   2017        4       12       20       28        12
4   2018        5       13       21       29        21
5   2018        6       14       22       30        22
6   2019        7       15       23       31        31
7   2019        8       16       24       32        32
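DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the line above fails on current versions. The following is a minimal sketch of one possible replacement based on NumPy positional indexing. It is not part of the original answer, and it assumes that every value in Years has a matching col#### column.

```python
import numpy as np
import pandas as pd

sampleDF = pd.DataFrame({'Years': [2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019],
                         'col2016': [1, 2, 3, 4, 5, 6, 7, 8],
                         'col2017': [9, 10, 11, 12, 13, 14, 15, 16],
                         'col2018': [17, 18, 19, 20, 21, 22, 23, 24],
                         'col2019': [25, 26, 27, 28, 29, 30, 31, 32]})

# Column label to pick in each row, e.g. 'col2016'
col_labels = 'col' + sampleDF['Years'].astype(str)

# Integer position of each label among the columns.
# Assumption: every label exists; get_indexer returns -1 for a missing label,
# which would silently select the last column here.
col_pos = sampleDF.columns.get_indexer(col_labels)

# Pick one value per row: row i, column col_pos[i]
sampleDF['year_val'] = sampleDF.to_numpy()[np.arange(len(sampleDF)), col_pos]

print(sampleDF)
```

If some years may lack a column, check for -1 in col_pos before indexing.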

Edit: If you check the definition of the lookup function, it contains:

result = [df.get_value(row, col) for row, col in zip(row_labels, col_labels)]

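You can rewrite it with DataFrame.at inside a try-except statement, so that a missing column yields NaN instead of raising a KeyError, and so that the following warning is avoided:

FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead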

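import numpy as np
import pandas as pd

# The first row uses 2015, which has no matching col2015 column
sampleDF = pd.DataFrame({'Years':[2015,2016,2017,2017,2018,2018,2019,2019],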
                        'col2016':[1,2,3,4,5,6,7,8],
                        'col2017':[9,10,11,12,13,14,15,16],
                        'col2018':[17,18,19,20,21,22,23,24],
                        'col2019':[25,26,27,28,29,30,31,32]})

print (sampleDF)
   Years  col2016  col2017  col2018  col2019
0   2015        1        9       17       25
1   2016        2       10       18       26
2   2017        3       11       19       27
3   2017        4       12       20       28
4   2018        5       13       21       29
5   2018        6       14       22       30
6   2019        7       15       23       31
7   2019        8       16       24       32

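out = []
# Pair each row label with the column label built from that row's year
for row, col in zip(sampleDF.index, 'col' + sampleDF['Years'].astype(str)):
    try:
        out.append(sampleDF.at[row, col])
    except KeyError:
        # No column for this year (e.g. col2015), so store a missing value
        out.append(np.nan)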

sampleDF['year_val'] = out
print (sampleDF)
   Years  col2016  col2017  col2018  col2019  year_val
0   2015        1        9       17       25       NaN
1   2016        2       10       18       26       2.0
2   2017        3       11       19       27      11.0
3   2017        4       12       20       28      12.0
4   2018        5       13       21       29      21.0
5   2018        6       14       22       30      22.0
6   2019        7       15       23       31      31.0
7   2019        8       16       24       32      32.0
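The same result can probably be obtained without a Python-level loop. The sketch below is an alternative to the try-except version, not the original author's code. It uses pd.factorize together with DataFrame.reindex, so that a label with no matching column (col2015 here) becomes an all-NaN column.

```python
import numpy as np
import pandas as pd

sampleDF = pd.DataFrame({'Years': [2015, 2016, 2017, 2017, 2018, 2018, 2019, 2019],
                         'col2016': [1, 2, 3, 4, 5, 6, 7, 8],
                         'col2017': [9, 10, 11, 12, 13, 14, 15, 16],
                         'col2018': [17, 18, 19, 20, 21, 22, 23, 24],
                         'col2019': [25, 26, 27, 28, 29, 30, 31, 32]})

# Column label to pick in each row, e.g. 'col2016'
col_labels = 'col' + sampleDF['Years'].astype(str)

# codes[i] is the position of row i's label within uniques
codes, uniques = pd.factorize(col_labels)

# One column per unique label; labels that are not real columns are filled with NaN
wide = sampleDF.reindex(columns=uniques).to_numpy()

# Pick one value per row: row i, column codes[i]
sampleDF['year_val'] = wide[np.arange(len(sampleDF)), codes]

print(sampleDF)
```

Because the missing year introduces NaN, year_val ends up as a float column, as in the loop version.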
