Python: checking a sum against a range in a pandas DataFrame

Time: 2020-03-31 15:45:31

Tags: python pandas dataframe

df1:

import pandas as pd

df1=pd.DataFrame({'id':['val1','val2','val3','val4','val5','val6'],
                  'min':['10','10','75','42','20','50'],
                  'max':['93','43','122','80','30','105']})

df2:

df2=pd.DataFrame({'id':['val1','val2','val5','val1','val5','val2'],
                  'check':['55.4','35.8','93','11.5','23.8','3.22']})

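The goal is to sum the check values in df2 for each id that matches an id in df1, test whether that sum falls within the min-max range given for the id in df1, and write the outcome to a result column in df2.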

Output df:

id            check        result
val1          55.4         positive
val2          35.8         positive
val5          93           positive
val3          10.1         negative
val1          11.5         positive
val5          23.8         positive
val2          3.22         positive

Thanks a lot!

3 answers:

Answer 0: (score: 3)

Let's use merge and eval

df=df2.merge(df1,how='left').eval('result=check>min and check < max')
Out[621]: 
     id check min max  result
0  val1  55.4  10  93    True
1  val2  35.8  10  43    True
2  val5    93  20  30   False
3  val1  11.5  10  93    True
4  val5  23.8  20  30    True
5  val2  3.22  10  43    True
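
Note that both frames in the question store their numbers as strings, so a comparison such as check>min compares text lexicographically; that is why 3.22 is reported as lying between 10 and 43 above. Below is a minimal sketch of the same per-row check after converting the columns with pd.to_numeric, assuming the values are meant to be numeric. The names merged and result are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['val1', 'val2', 'val3', 'val4', 'val5', 'val6'],
                    'min': ['10', '10', '75', '42', '20', '50'],
                    'max': ['93', '43', '122', '80', '30', '105']})
df2 = pd.DataFrame({'id': ['val1', 'val2', 'val5', 'val1', 'val5', 'val2'],
                    'check': ['55.4', '35.8', '93', '11.5', '23.8', '3.22']})

# Assumption: the string columns are meant to hold numbers.
df1[['min', 'max']] = df1[['min', 'max']].apply(pd.to_numeric)
df2['check'] = pd.to_numeric(df2['check'])

# Bring each row's bounds over from df1, then compare row by row.
merged = df2.merge(df1, on='id', how='left')
merged['result'] = (merged['check'] > merged['min']) & (merged['check'] < merged['max'])
print(merged)
```

With numeric columns the last row (val2, 3.22) is no longer inside its range, because 3.22 is below 10.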

Answer 1: (score: 2)

We can merge and use between

import numpy as np

(df2.merge(df1, on='id', how='left')
   .assign(result=lambda x: np.where(x.check.between(x['min'],x['max']), 
                                     'positive', 'negative')
          )
   .drop(['min','max'], axis=1)
)

Output:

     id check    result
0  val1  55.4  positive
1  val2  35.8  positive
2  val5    93  negative
3  val1  11.5  positive
4  val5  23.8  positive
5  val2  3.22  positive
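
One detail worth knowing when choosing between this approach and explicit > and < comparisons: Series.between includes both endpoints by default, so a value exactly equal to min or max counts as inside the range. The sketch below illustrates the difference on made-up numeric values (not the data from the question).

```python
import pandas as pd

# Illustrative values only: two of them sit exactly on a bound.
check = pd.Series([10.0, 55.4, 93.0, 93.1])
low = pd.Series([10.0, 10.0, 10.0, 10.0])
high = pd.Series([93.0, 93.0, 93.0, 93.0])

# between() treats the bounds as part of the range by default.
inside_inclusive = check.between(low, high)

# Strict comparisons exclude values that sit exactly on a bound.
inside_strict = (check > low) & (check < high)

print(pd.DataFrame({'check': check,
                    'inclusive': inside_inclusive,
                    'strict': inside_strict}))
```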

Answer 2: (score: 2)

I think you need DataFrame.merge and GroupBy.transform. Then use np.where to create a new column:

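import numpy as np

# Attach each row's min/max from df1; a left merge keeps df2's row order
df3 = df2.merge(df1, how='left', on = 'id')
# Per-id total of check, broadcast back to every row of that id
s = df3.groupby('id')['check'].transform('sum')
df2['result']=np.where(s.lt(df3['max']) & s.gt(df3['min']), 'positive', 'negative')
print(df2)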

Output df2

     id check    result
0  val1  55.4  positive
1  val2  35.8  positive
2  val5    93  negative
3  val1  11.5  positive
4  val5  23.8  negative
5  val2  3.22  positive
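
For reference, here is a self-contained sketch of the same sum-then-check idea. It rests on two assumptions that the question does not state: the values are converted to numbers first (the question stores them as strings), and each id appears only once in df1. It looks up the bounds with Series.map instead of a merge, which is a substitution made for brevity rather than the method used in the answer above; the total column and the names low, high and bounds are illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the question, written as numbers (assumption).
df1 = pd.DataFrame({'id': ['val1', 'val2', 'val3', 'val4', 'val5', 'val6'],
                    'min': [10, 10, 75, 42, 20, 50],
                    'max': [93, 43, 122, 80, 30, 105]})
df2 = pd.DataFrame({'id': ['val1', 'val2', 'val5', 'val1', 'val5', 'val2'],
                    'check': [55.4, 35.8, 93, 11.5, 23.8, 3.22]})

# Total of check per id, repeated on every row belonging to that id.
totals = df2.groupby('id')['check'].transform('sum')

# Look up each row's bounds by id (requires unique ids in df1).
bounds = df1.set_index('id')
low = df2['id'].map(bounds['min'])
high = df2['id'].map(bounds['max'])

df2['total'] = totals
df2['result'] = np.where((totals > low) & (totals < high), 'positive', 'negative')
print(df2)
```

Here val1 totals about 66.9 and val2 about 39.02, both inside their ranges, while val5 totals 116.8, which is above its maximum of 30, so both val5 rows come out negative.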