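How to compare columns of 2 data frames and add values to a new data frame based on the result

Time: 2019-08-29 12:42:04

Tags: pandas compare

I have 2 data frames of the same length and I want to compare a specific column between them. Whichever data frame has the larger value in the first column, I want to take its value from the second column and assign it to a new data frame. See the example. First data frame: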

       0   class
0    1.9       0
1    9.8       0
2    4.5       0
3    8.1       0
4    1.9       0

Second data frame:

       0   class
0    1.4       1
1    7.8       1
2    8.5       1
3    9.1       1
4    3.9       1

The new data frame should look like this:

  class
0     0
1     0
2     1
3     1
4     1

3 answers:

Answer 0 (score: 3)

Use numpy.where together with the DataFrame constructor:

df = pd.DataFrame({'class': np.where(df1[0] > df2[0], df1['class'], df2['class'])})

Or use DataFrame.where:

df = df1[['class']].where(df1[0] > df2[0], df2[['class']])

print (df)
   class
0      0
1      0
2      1
3      1
4      1
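Both variants can be reproduced end to end with the sample data from the question. The following is a minimal sketch, assuming the value column is labelled with the integer 0 (as in this answer) and that both frames share the same default index; the names mask, result_np and result_pd are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frames from the question; the value column is labelled with the integer 0.
df1 = pd.DataFrame({0: [1.9, 9.8, 4.5, 8.1, 1.9], 'class': [0, 0, 0, 0, 0]})
df2 = pd.DataFrame({0: [1.4, 7.8, 8.5, 9.1, 3.9], 'class': [1, 1, 1, 1, 1]})

# True where df1 holds the larger value in column 0.
mask = df1[0] > df2[0]

# numpy.where returns a plain array, so wrap it in a new DataFrame.
result_np = pd.DataFrame({'class': np.where(mask, df1['class'], df2['class'])})

# DataFrame.where keeps df1's class where the mask is True and takes df2's elsewhere.
result_pd = df1[['class']].where(mask, df2[['class']])

print(result_np['class'].tolist())
print(result_pd['class'].tolist())
```

Both approaches compare rows by position here only because the two indexes are identical; if the frames were indexed differently, pandas would align on the index labels before comparing.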

Edit:

If there are more conditions to handle (for example, equal values), use numpy.select, and numpy.isclose if necessary:

print (df2)
     0  class
0  1.4      1
1  7.8      1
2  8.5      1
3  9.1      1
4  1.9      1


masks = [df1[0] == df2[0], df1[0] > df2[0]]
# to compare floats within a tolerance, use these masks instead:
# masks = [np.isclose(df1[0], df2[0]), df1[0] > df2[0]]
vals = ['not_determined', df1['class']]
df = pd.DataFrame({'class': np.select(masks, vals, df2['class'])})
print (df)
            class
0               0
1               0
2               1
3               1
4  not_determined

Or, with the class labels written as constants:

masks = [df1[0] == df2[0], df1[0] > df2[0]]
vals = ['not_determined', 0]
df = pd.DataFrame({'class': np.select(masks, vals, 1)})
print (df)
            class
0               0
1               0
2               1
3               1
4  not_determined
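The numpy.select approach can be tried as a self-contained script. This is a sketch under a few assumptions: the tie is detected with numpy.isclose, and the choices are cast to object dtype, which is not part of the answer above. The cast is there because mixing strings and integers relies on NumPy's dtype promotion, which may turn the integers into strings or raise an error depending on the NumPy version.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({0: [1.9, 9.8, 4.5, 8.1, 1.9], 'class': [0, 0, 0, 0, 0]})
# Last value is 1.9 so that row 4 is a tie, as in the edit above.
df2 = pd.DataFrame({0: [1.4, 7.8, 8.5, 9.1, 1.9], 'class': [1, 1, 1, 1, 1]})

# Conditions are checked in order; the first one that is True wins.
masks = [np.isclose(df1[0], df2[0]), df1[0] > df2[0]]

# Assumption of this sketch: object dtype keeps the integer labels as
# integers next to the 'not_determined' string.
vals = [
    np.full(len(df1), 'not_determined', dtype=object),
    df1['class'].astype(object),
]
default = df2['class'].astype(object)

result = pd.DataFrame({'class': np.select(masks, vals, default=default)})
print(result['class'].tolist())
```

Rows that match none of the masks fall through to the default, which here is the class from df2.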

An out-of-the-box solution:

df = np.sign(df1[0].sub(df2[0])).map({1:0, -1:1, 0:'not_determined'}).to_frame('class')
print (df)
            class
0               0
1               0
2               1
3               1
4  not_determined
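The one-liner above chains three steps, which may be easier to follow when split up. The sketch below uses the tie data from the edit; the intermediate names diff and sign are illustrative, and the cast to int is an addition of this sketch so that the dictionary keys match integer values exactly.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({0: [1.9, 9.8, 4.5, 8.1, 1.9], 'class': [0, 0, 0, 0, 0]})
df2 = pd.DataFrame({0: [1.4, 7.8, 8.5, 9.1, 1.9], 'class': [1, 1, 1, 1, 1]})

# Step 1: element-wise difference of the value columns.
diff = df1[0].sub(df2[0])

# Step 2: reduce each difference to 1 (df1 larger), -1 (df2 larger) or 0 (equal).
sign = np.sign(diff).astype(int)

# Step 3: translate each sign into a class label and build the new frame.
result = sign.map({1: 0, -1: 1, 0: 'not_determined'}).to_frame('class')
print(result['class'].tolist())
```

Note that this variant hard-codes the labels 0 and 1 and treats only an exactly zero difference as a tie, so it does not apply the tolerance that numpy.isclose provides.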

Answer 1 (score: 2)

Since the classes are 0 and 1, you can try converting the comparison result directly to an integer, for example:

df = df1[0].lt(df2[0]).astype(int).to_frame('class')

For a general solution, see jezrael's answer.

Answer 2 (score: 1)

Try this:

>>> import numpy as np
>>> import pandas as pd
>>> df_1
     0  class
0  1.9      0
1  9.8      0
2  4.5      0
3  8.1      0
4  1.9      0
>>> df_2
     0  class
0  1.4      1
1  7.8      1
2  8.5      1
3  9.1      1
4  3.9      1
>>> df_3=pd.DataFrame()
>>> df_3["class"]=np.where(df_1["0"]>df_2["0"], df_1["class"], df_2["class"])
>>> df_3
   class
0      0
1      0
2      1
3      1
4      1
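This session indexes the value column with the string "0", while Answer 0 uses the integer 0. Which one works depends on how the frames were created, which the question does not say. A small sketch for checking and normalising the label, using hypothetical frames df_int and df_str:

```python
import pandas as pd

# Built in code: the first column label is the integer 0.
df_int = pd.DataFrame({0: [1.9, 9.8], 'class': [0, 0]})
print(df_int.columns.tolist())

# Converting all labels to strings lets df["0"] work, as in the session above.
df_str = df_int.copy()
df_str.columns = df_str.columns.astype(str)
print(df_str.columns.tolist())
print(df_str['0'].tolist())
```

If the lookup raises a KeyError, printing the column labels as above is usually the quickest way to see which form the frame actually uses.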