Compare a master DataFrame with a child DataFrame and extract new rows based on only two column values

Time: 2019-06-16 04:12:52

Tags: python python-3.x pandas dataframe

I have two DataFrames:

Master_DF:

Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,120.0,1.0,32.0,96156.9,81.05,1.15,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,80.0,0.8,29.0,161207.0,56.95,0.95,6500
PCJEWELLER,82.5,0.55,31.0,154772.0,56.95,0.95,6500
PCJEWELLER,85.0,0.6,33.0,147882.0,56.95,0.7,6500
PCJEWELLER,90.0,0.5,37.0,138977.0,56.95,0.55,6500

and Child_DF:

Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,150.0,1.3,22.0,44156.9,81.05,1.05,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,100.0,1.8,29.0,441207.0,46.95,4.95,6500

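I want to compare Child_DF against Master_DF based on the columns (Symbol, Strike_Price): if a Symbol and Strike_Price combination already exists in Master_DF, that row should not be treated as new data.

The new rows are: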

Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,150.0,1.3,22.0,44156.9,81.05,1.05,2200
PCJEWELLER,100.0,1.8,29.0,441207.0,46.95,4.95,6500

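2 answers:

Answer 0 (score: 1)

You can do a right merge with indicator=True, then use query to keep only the 'right_only' rows, and finally call reindex() to get the columns back in the child's order: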

(master.merge(child,on=['Symbol','Strike_Price'],how='right',
          suffixes=('_',''),indicator=True)
    .query('_merge=="right_only"')).reindex(child.columns,axis=1)

       Symbol  Strike_Price  C_BidPrice  Pecentage  Margin_Req  Underlay  \
2  JETAIRWAYS         150.0         1.3       22.0     44156.9     81.05   
3  PCJEWELLER         100.0         1.8       29.0    441207.0     46.95   

   C_LTP  LotSize  
2   1.05     2200  
3   4.95     6500  
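
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The names master and child come from the snippet above; reading the sample data through io.StringIO and the variable names keys, merged and new_rows are assumptions made for illustration. The row labels you see may differ from the output shown above, so the sketch does not rely on them.

```python
import io

import pandas as pd

# Sample data copied from the question.
master_csv = """Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,120.0,1.0,32.0,96156.9,81.05,1.15,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,80.0,0.8,29.0,161207.0,56.95,0.95,6500
PCJEWELLER,82.5,0.55,31.0,154772.0,56.95,0.95,6500
PCJEWELLER,85.0,0.6,33.0,147882.0,56.95,0.7,6500
PCJEWELLER,90.0,0.5,37.0,138977.0,56.95,0.55,6500
"""
child_csv = """Symbol,Strike_Price,C_BidPrice,Pecentage,Margin_Req,Underlay,C_LTP,LotSize
JETAIRWAYS,110.0,1.25,26.0,105308.9,81.05,1.2,2200
JETAIRWAYS,150.0,1.3,22.0,44156.9,81.05,1.05,2200
PCJEWELLER,77.5,0.95,27.0,171217.0,56.95,1.3,6500
PCJEWELLER,100.0,1.8,29.0,441207.0,46.95,4.95,6500
"""
master = pd.read_csv(io.StringIO(master_csv))
child = pd.read_csv(io.StringIO(child_csv))

keys = ['Symbol', 'Strike_Price']

# Right merge keeps every child row; indicator=True adds a "_merge" column
# saying whether the key pair was found in both frames or only in the child.
# suffixes=('_', '') leaves the child's column names unchanged.
merged = master.merge(child, on=keys, how='right',
                      suffixes=('_', ''), indicator=True)

# Keep rows whose key pair exists only in the child, then restore the
# child's column order (this also drops the master copies and "_merge").
new_rows = merged.query('_merge == "right_only"').reindex(child.columns, axis=1)
print(new_rows)
```

The result should contain the two rows listed in the question (Strike_Price 150.0 and 100.0), with the same columns as child.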

Answer 1 (score: 0)

  1. First, merge the two DataFrames on Symbol and Strike_Price, with indicator=True and how='right'.

  2. Then filter the _merge column for right_only to get the desired result:

result = result[result['_merge']=='right_only']
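
The two steps can be sketched together as follows. This is only an illustration under some assumptions: the frames are trimmed to three columns to keep the example short, and the merge uses only the key columns of master (an addition not in the answer) so that no suffixed duplicate columns appear in the result. The names master, child and keys are chosen for the example.

```python
import pandas as pd

# Trimmed-down versions of the question's data (keys plus one value column).
master = pd.DataFrame({
    'Symbol': ['JETAIRWAYS', 'JETAIRWAYS', 'PCJEWELLER', 'PCJEWELLER',
               'PCJEWELLER', 'PCJEWELLER', 'PCJEWELLER'],
    'Strike_Price': [110.0, 120.0, 77.5, 80.0, 82.5, 85.0, 90.0],
    'C_BidPrice': [1.25, 1.0, 0.95, 0.8, 0.55, 0.6, 0.5],
})
child = pd.DataFrame({
    'Symbol': ['JETAIRWAYS', 'JETAIRWAYS', 'PCJEWELLER', 'PCJEWELLER'],
    'Strike_Price': [110.0, 150.0, 77.5, 100.0],
    'C_BidPrice': [1.25, 1.3, 0.95, 1.8],
})

keys = ['Symbol', 'Strike_Price']

# Step 1: right merge on the two key columns with an indicator column.
# Only master's key columns take part, so child's other columns keep
# their original names.
result = master[keys].merge(child, on=keys, how='right', indicator=True)

# Step 2: keep only the rows whose key pair is absent from master.
result = result[result['_merge'] == 'right_only']

# Tidy up: drop the helper column and renumber the rows.
result = result.drop(columns='_merge').reset_index(drop=True)
print(result)
```

If master could contain the same (Symbol, Strike_Price) pair more than once, calling drop_duplicates() on master[keys] before merging would avoid repeated child rows in the intermediate result; the final filtered rows are unaffected either way.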
