Compare 2 dataframes and add the differing columns, Python 3.6

Time: 2017-09-23 13:58:41

Tags: python pandas dataframe

I have a dataframe with 10 columns

df1: col1, col2, col3, col4, col5, col6, col7, col8, col9, col10

and another dataframe with 5 columns

df2: col1, col2, col6, col9, col3

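I want to compare df2 with df1 and add to df2 the columns of df1 that df2 does not have.

This is not a duplicate of "Compare Pandas dataframes and add column": I don't want to add any values from df1, only blank columns.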

2 answers:

Answer 0: (score: 1)

import pandas as pd

dfa = pd.DataFrame({'a':[1,2,3], 'b':[5,6,7]})
dfb = pd.DataFrame({'a':[7,7,7], 'c':[4,4,4], 'e':[0,0,0]})

>>> dfa
   a  b
0  1  5
1  2  6
2  3  7
>>> dfb
   a  c  e
0  7  4  0
1  7  4  0
2  7  4  0

Find the columns that differ

>>> col_diff = dfb.columns.difference(dfa.columns)
>>> col_diff
Index(['c', 'e'], dtype='object')

Make a list of the new columns and add them:

>>> new = col_diff.tolist()
>>> new
['c', 'e']
>>> 
>>> for col in new:
...     dfa[col] = None

>>> dfa
   a  b     c     e
0  1  5  None  None
1  2  6  None  None
2  3  7  None  None
>>>
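The same loop can be applied to the frames from the question. This is only a sketch: the data values below are invented stand-ins, since the question gives column names only. Note that `Index.difference` returns the missing names sorted, not in df1's original order.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames (values are invented).
df1 = pd.DataFrame({f"col{i}": [i, i] for i in range(1, 11)})
df2 = pd.DataFrame({c: [0, 0] for c in ["col1", "col2", "col6", "col9", "col3"]})

# Columns present in df1 but absent from df2 (returned sorted).
missing = df1.columns.difference(df2.columns).tolist()

# Add each missing column to df2 as an empty column; no df1 values are copied.
for col in missing:
    df2[col] = None

print(df2.columns.tolist())
```

Because the new columns are filled with None, they come out with object dtype.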

Using DataFrame.assign (same initial DataFrames)

>>> # try it when the df indices are different
>>> dfc = dfb.set_index('a')
>>> dfc
   c  e
a      
7  4  0
7  4  0
7  4  0

>>> diff = dfc.columns.difference(dfa.columns)
>>> new = diff.tolist()
>>> new = {col:None for col in new}
>>> dfa = dfa.assign(**new)

>>> dfa
   a  b     c     e
0  1  5  None  None
1  2  6  None  None
2  3  7  None  None
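A possible alternative, not part of the answer above, is `DataFrame.reindex`, which fills any column it has to create with NaN. The sketch below again uses invented stand-in data for the question's df1 and df2, and builds the column list by hand so that df2's order is kept and the missing columns follow in df1's order.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames (values are invented).
df1 = pd.DataFrame({f"col{i}": [i, i] for i in range(1, 11)})
df2 = pd.DataFrame({c: [0, 0] for c in ["col1", "col2", "col6", "col9", "col3"]})

# df1 columns that df2 lacks, kept in df1's order.
missing = [c for c in df1.columns if c not in df2.columns]

# reindex returns a new frame; columns it has to create are filled with NaN.
result = df2.reindex(columns=list(df2.columns) + missing)

print(result.columns.tolist())
```

Unlike the loop, this leaves df2 itself unchanged and returns a new frame.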

Answer 1: (score: 0)

For this to work, the indexes must match. Assuming they do, try something like:

pd.concat([df1.drop(df2.columns, axis=1), df2], axis=1)
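Note that `df1.drop(df2.columns, axis=1)` keeps df1's values in the remaining columns, so the line above carries those values into the result. If only blank columns are wanted, one possible variant is to concatenate an empty frame built on df2's index. This is a sketch with invented stand-in data, and the name `blank` is introduced here for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames (values are invented).
df1 = pd.DataFrame({f"col{i}": [i, i] for i in range(1, 11)})
df2 = pd.DataFrame({c: [0, 0] for c in ["col1", "col2", "col6", "col9", "col3"]})

# Names of the df1 columns that df2 lacks.
missing = df1.columns.difference(df2.columns)

# An all-NaN frame that shares df2's index, so the concat aligns row for row.
blank = pd.DataFrame(index=df2.index, columns=missing)

result = pd.concat([df2, blank], axis=1)

print(result.columns.tolist())
```

Building `blank` on `df2.index` sidesteps the index-matching requirement, because df1 contributes only its column names.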