Error: Can only compare identically-labeled DataFrame objects

Date: 2019-08-10 16:27:06

Tags: python python-3.x pandas dataframe

I have two DataFrames:

prev_df:

       Time       FO_SYMBOL  TOTAL_VOLUME
0  14:20:41             ACC        6778.0
1  14:56:57        ADANIENT        4314.0
2  09:19:12      AUROPHARMA        1295.0
3  15:09:14      BAJAJ-AUTO        8339.0
4  09:19:12         HCLTECH        1431.0
5  09:19:12      HEROMOTOCO        1551.0
6  13:53:02      ULTRACEMCO        8284.0

df:

       Time       FO_SYMBOL  TOTAL_VOLUME
0  14:20:41             ACC        6778.0
1  14:56:57        ADANIENT        4314.0
2  09:19:12      AUROPHARMA        1295.0
3  15:09:14      BAJAJ-AUTO        8339.0
4  09:19:12         HCLTECH        1431.0
5  09:19:12      HEROMOTOCO        1551.0
6  13:53:02      ULTRACEMCO        8284.0
7  14:55:12            BHEL        8114.0 <<= NEW ROW
8  14:55:12            BHEL        8120.0 <<= NEW ROW

I want to compare the two DataFrames and extract the rows that are new in df. My expected output is:

Result:

0  14:55:12            BHEL        8114.0 <<= NEW ROW
1  14:55:12            BHEL        8120.0 <<= NEW ROW

Currently I am using the following code:

indexes = (df != prev_df).any(axis=1)
new_df = df.loc[indexes]

But as soon as new rows are added to df, I get the following error:

Can only compare identically-labeled DataFrame objects

Please help.

2 Answers:

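Answer 0 (score: 1)

Try a left merge with indicator=True and keep only the rows marked left_only:

import pandas as pd

# Left-merge on all shared columns; `_merge` records where each row came from
df3 = pd.merge(df, prev_df, how='left', indicator=True)
# Keep rows found only in df, then drop the helper column
df3 = df3[df3['_merge'] == 'left_only'].drop(columns='_merge')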

      Time FO_SYMBOL  TOTAL_VOLUME
7  14:55:12      BHEL        8114.0
8  14:55:12      BHEL        8120.0
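
For context, `df != prev_df` fails because element-wise comparison requires both frames to have the same index and column labels, and here `df` has two extra rows. The merge approach avoids this by matching rows on their values instead of their labels. Below is a minimal, self-contained sketch on a shortened version of the question's data. The helper name `new_rows` is hypothetical, and the sketch assumes the two frames share the same columns and that `prev_df` has no duplicate rows.

```python
import pandas as pd

# Shortened sample data modelled on the question
prev_df = pd.DataFrame({
    "Time": ["14:20:41", "14:56:57", "09:19:12"],
    "FO_SYMBOL": ["ACC", "ADANIENT", "AUROPHARMA"],
    "TOTAL_VOLUME": [6778.0, 4314.0, 1295.0],
})
extra = pd.DataFrame({
    "Time": ["14:55:12", "14:55:12"],
    "FO_SYMBOL": ["BHEL", "BHEL"],
    "TOTAL_VOLUME": [8114.0, 8120.0],
})
df = pd.concat([prev_df, extra], ignore_index=True)


def new_rows(current, previous):
    """Return rows of `current` that have no exact match in `previous`.

    Hypothetical helper: with no `on=` argument, merge joins on every
    column the two frames have in common.
    """
    merged = current.merge(previous, how="left", indicator=True)
    mask = merged["_merge"] == "left_only"
    return merged.loc[mask].drop(columns="_merge").reset_index(drop=True)


result = new_rows(df, prev_df)
print(result)
```

The final `reset_index(drop=True)` renumbers the rows from 0, as in the expected result in the question; omit it to keep the positions from the merged frame.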

Answer 1 (score: 1)

You can use concat together with drop_duplicates:

cols=prev_df.columns.intersection(df.columns).tolist()
pd.concat([df, pd.concat([prev_df]*2)]).drop_duplicates(cols, keep=False)

       Time FO_SYMBOL  TOTAL_VOLUME
7  14:55:12      BHEL        8114.0
8  14:55:12      BHEL        8120.0
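
The reason `prev_df` is concatenated twice may not be obvious: with `keep=False`, every row that occurs more than once is dropped, so doubling `prev_df` guarantees that all of its rows disappear, including rows that are missing from `df`. The sketch below illustrates this on small made-up frames (the data and the variable names `stacked`, `only_in_df`, and `single` are assumptions for illustration). One caveat: a new row that appears twice in `df` with identical values would also be dropped.

```python
import pandas as pd

# Small illustrative frames: HCLTECH exists only in prev_df, BHEL only in df
prev_df = pd.DataFrame({
    "FO_SYMBOL": ["ACC", "HCLTECH"],
    "TOTAL_VOLUME": [6778.0, 1431.0],
})
df = pd.DataFrame({
    "FO_SYMBOL": ["ACC", "BHEL", "BHEL"],
    "TOTAL_VOLUME": [6778.0, 8114.0, 8120.0],
})

cols = prev_df.columns.intersection(df.columns).tolist()

# prev_df stacked twice: each of its rows occurs at least twice and is removed
stacked = pd.concat([df, prev_df, prev_df])
only_in_df = stacked.drop_duplicates(subset=cols, keep=False)
print(only_in_df)

# prev_df stacked once: the HCLTECH row is unique and wrongly survives
single = pd.concat([df, prev_df]).drop_duplicates(subset=cols, keep=False)
print(single)
```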