根据另一个数据框中的值在数据框中创建列表列

时间:2020-02-20 12:17:41

标签: python python-3.x pandas

我有两个DataFrames

df1:

       node        ids
0   ab          [978]
1   bc          [978, 121]

df2:

       name        id
0   alpha          978
1   bravo          121

我想在 df1 中添加一个名为 names 的新列,在该列中,我可以找到与id列相对应的名称列表

   node            ids             names
0   ab            [978]            [alpha]
1   bc            [978, 121]       [alpha,bravo]

需要帮助。

2 个答案:

答案 0 :(得分:4)

如果两个id值都是整数(或两个字符串,相同类型),则使用:

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d.get(y) for y in x] for x in df1['ids']]
print (df1)
  node         ids           names
0   ab       [978]         [alpha]
1   bc  [978, 121]  [alpha, bravo]

如果列表中不匹配df2['id']的可能值被替换为一些不匹配的值:

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d.get(y, 'no match') for y in x] for x in df1['ids']]
print (df1)
  node         ids              names
0   ab   [978, 10]  [alpha, no match]
1   bc  [978, 121]     [alpha, bravo]

或者可以省略以下值:

d = df2.set_index('id')['name'].to_dict()
df1['names'] = [[d[y] for y in x if y in d.keys()] for x in df1['ids']]
print (df1)
  node         ids           names
0   ab   [978, 10]         [alpha]
1   bc  [978, 121]  [alpha, bravo]

答案 1 :(得分:0)

您如何尝试这种替代解决方案?

df1 = (df1.reset_index()).merge(
        ((df1['ids'].explode().reset_index()).merge(
                df2,how='left',left_on='ids',right_on='id').groupby('index')['name','ids'].agg(
                        lambda x: list(x)).reset_index()),
                how='left',on='index').drop(
                        columns=['index','ids_y']).rename(
                                columns={'ids_x':'ids'})
print(df1)

输出:

  node         ids            name
0   ab       [978]         [alpha]
1   bc  [978, 121]  [alpha, bravo]