Merge multiple rows of a DataFrame column into a single row

Time: 2019-03-28 07:32:16

Tags: python pandas dataframe

My current DataFrame looks like this:

              0           1         2
0  HA-567034786  AB-1018724      None
1    AB-6348403  HA-7298656      None

After using apply(), I get this:

def make_dict(row):
    s = set(x for x in row if x)
    return {x: list(s - {x}) for x in s}

result = df.apply(make_dict, axis=1).to_frame(name = 'duplicates')

                                                           duplicates
0  {'HA-567034786': ['AB-1018724'], 'AB-1018724': ['HA-567034786']}
1      {'AB-6348403': ['HA-7298656'], 'HA-7298656': ['AB-6348403']}

Now I am stuck trying to combine these per-row dictionaries into a single dictionary like the one below:

{
  'HA-567034786': ['AB-1018724'], 'AB-1018724': ['HA-567034786'],
  'AB-6348403': ['HA-7298656'], 'HA-7298656': ['AB-6348403']
}

2 answers:

Answer 0: (score: 1)

Instead of apply, use a dictionary comprehension that flattens the per-row dictionaries:

print (df)
              0           1
0  HA-567034786  AB-1018724
1    AB-6348403  HA-7298656

def make_dict(row):
    s = set(x for x in row if x)
    return {x: list(s - {x}) for x in s}

result = {k:v for x in df.values for k, v in make_dict(x).items()}

print (result)
{'HA-567034786': ['AB-1018724'],
 'AB-1018724': ['HA-567034786'], 
 'HA-7298656': ['AB-6348403'],
 'AB-6348403': ['HA-7298656']}

Another solution that uses apply:

result = {k:v for x in df.apply(make_dict, axis=1) for k, v in x.items()}
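To see what the flattening does without pandas, here is a minimal sketch on plain tuples. The `rows` list and its IDs are hypothetical stand-ins for `df.values`, and `sorted()` is used in place of `list()` only so the printed output is stable. Note that if the same ID appears in more than one row, the comprehension keeps the entry from the last row.

```python
def make_dict(row):
    # Drop empty cells (None), then map each ID to the other IDs in its row.
    s = set(x for x in row if x)
    # sorted() instead of list() only to make the output order deterministic
    return {x: sorted(s - {x}) for x in s}

# Hypothetical rows standing in for df.values
rows = [
    ("HA-1", "AB-2", "CD-3"),
    ("AB-4", "HA-5", None),
]

# Flatten the per-row dictionaries into one dictionary
result = {k: v for row in rows for k, v in make_dict(row).items()}

print(result["HA-1"])  # ['AB-2', 'CD-3']
print(result["AB-4"])  # ['HA-5']
```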

Answer 1: (score: 1)

You can also use collections.ChainMap() to merge all the dictionaries into one:

from collections import ChainMap
res = dict(ChainMap(*result['duplicates']))
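One difference worth knowing: when a key occurs in several dictionaries, ChainMap returns the value from the first mapping that contains it, while a flattening dict comprehension keeps the last one. With the sample data every key is unique, so both give the same result. A small sketch, using a plain list (`row_dicts`) as a stand-in for the `duplicates` column:

```python
from collections import ChainMap

# Stand-in for result['duplicates']: one dictionary per row
row_dicts = [
    {'HA-567034786': ['AB-1018724'], 'AB-1018724': ['HA-567034786']},
    {'AB-6348403': ['HA-7298656'], 'HA-7298656': ['AB-6348403']},
]

# Merge all per-row dictionaries into one plain dict
res = dict(ChainMap(*row_dicts))
print(len(res))  # 4

# Duplicate keys: ChainMap keeps the first value, a comprehension the last
dupes = [{'a': 1}, {'a': 2}]
first_wins = dict(ChainMap(*dupes))
last_wins = {k: v for d in dupes for k, v in d.items()}
print(first_wins, last_wins)  # {'a': 1} {'a': 2}
```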