Maximum value using idxmax

Time: 2016-12-07 20:53:47

Tags: python max difference

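I am trying to find the country with the biggest difference between its summer and winter gold medal counts, relative to its total gold medal count. The catch is that I should only consider countries that have won at least one gold medal in both summer and winter.

Gold: number of summer gold medals

Gold.1: number of winter gold medals

Gold.2: total gold medals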

Here is a sample of my data:

            Gold    Gold.1  Gold.2  ID  diff gold %
Afghanistan 0       0       0       AFG NaN
Algeria     5       0       5       ALG 1.000000
Argentina   18      0       18      ARG 1.000000
Armenia     1       0       1       ARM 1.000000
Australasia 3       0       3       ANZ 1.000000
Australia   139     5       144     AUS 0.930556
Austria     18      59      77      AUT 0.532468
Azerbaijan  6       0       6       AZE 1.000000
Bahamas     5       0       5       BAH 1.000000
Bahrain     0       0       0       BRN NaN
Barbados    0       0       0       BAR NaN
Belarus     12      6       18      BLR 0.333333

This is my code, but it gives the wrong answer:

def answer():
    Gold_Y = df2[(df2['Gold'] > 1) | (df2['Gold.1'] > 1)]
    df2['difference'] = (df2['Gold']-df2['Gold.1']).abs()/df2['Gold.2']
    return df2['diff gold %'].idxmax()

answer()  

8 answers:

Answer 0 (score: 1)

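Try this code after adapting it to the correct (your) function and variable names. I'm new to Python, but I think the problem is that you have to use the same column (df1['difference']) on line 4 and then add the method (.idxmax()) at the end. I don't think you need the first line of the function, because you never use the local variable (Gold_Y). FYI, I don't think we are using the same dataset.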

def answer_three():
    df1['difference'] = (df1['Gold']-df1['Gold.1']).abs()/df1['Gold.2']
    return df1['difference'].idxmax()

answer_three()

Answer 1 (score: 1)

def answer_three():
    # "At least one gold" means > 0; > 1 would drop countries with exactly one.
    atleast_one_gold = df[(df['Gold'] > 0) & (df['Gold.1'] > 0)]
    return ((atleast_one_gold['Gold'] - atleast_one_gold['Gold.1']) / atleast_one_gold['Gold.2']).idxmax()

answer_three()

Answer 2 (score: 0)

def answer_three():
    _df = df[(df['Gold'] > 0) & (df['Gold.1'] > 0)]
    # idxmax returns the index label; in current pandas, argmax returns a position.
    return ((_df['Gold'] - _df['Gold.1']) / _df['Gold.2']).idxmax()

answer_three()

Answer 3 (score: 0)

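This looks like a question from a programming assignment in the Coursera course "Introduction to Data Science in Python".

That said, if you are not cheating, "maybe" the error is here: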

Gold_Y = df2[(df2['Gold'] > 1) | (df2['Gold.1'] > 1)]

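You should use the & operator. The | operator means you keep countries that won gold at either the Summer or the Winter Olympics.

You should not get NaN in your differences.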
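
To make the difference between the two operators concrete, here is a minimal sketch using a few rows copied from the sample in the question. The helper name `relative_diff` is hypothetical, and the `> 0` threshold is my reading of "at least one gold medal" (the question's code uses `> 1`).

```python
import pandas as pd

# A few rows copied from the sample data in the question.
df = pd.DataFrame(
    {
        "Gold": [0, 5, 139, 18, 12],
        "Gold.1": [0, 0, 5, 59, 6],
        "Gold.2": [0, 5, 144, 77, 18],
    },
    index=["Afghanistan", "Algeria", "Australia", "Austria", "Belarus"],
)

# | keeps countries with gold in summer OR winter; & requires both.
either = df[(df["Gold"] > 0) | (df["Gold.1"] > 0)]
both = df[(df["Gold"] > 0) & (df["Gold.1"] > 0)]


def relative_diff(frame):
    # Hypothetical helper: |summer - winter| / total, as in the question.
    return (frame["Gold"] - frame["Gold.1"]).abs() / frame["Gold.2"]


print(relative_diff(df))               # Afghanistan is NaN (0 / 0)
print(relative_diff(either).idxmax())  # Algeria (ratio 1.0, summer-only)
print(relative_diff(both).idxmax())    # Australia (134 / 144)
```

With the & mask every remaining country has a non-zero total, so the division can no longer produce NaN.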

Answer 4 (score: 0)

def answer_three():
    diff=df['Gold']-df['Gold.1']
    relativegold = diff.abs()/df['Gold.2']
    df['relativegold']=relativegold
    x = df[(df['Gold.1']>0) &(df['Gold']>0) ]
    return x['relativegold'].idxmax(axis=0)

answer_three()

Answer 5 (score: 0)

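I am still a newbie at Python and at programming in general, so my solution will be the most newbie one ever! I like creating variables, so you will see a lot of them in my solution.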

    def answer_three():
        # Boolean masking: keep the 'Gold' values of countries with at least
        # one gold medal at the Summer Olympics.
        a = df.loc[df['Gold'] > 0, 'Gold']

        # Same as above, but 'Gold.1' is gold medals at the Winter Olympics.
        b = df.loc[df['Gold.1'] > 0, 'Gold.1']

        # Absolute difference between a and b. Index alignment gives NaN for
        # countries missing from either series; dropna() removes them.
        dif = abs(a - b).dropna()

        # I only realised later that this step wasn't essential, because the
        # data frame already has the sum in the 'Gold.2' column.
        tots = (a + b).dropna()

        result = dif / tots

        # Return the index label of the max result.
        return result.idxmax()

Answer 6 (score: 0)

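def answer_two():
    # Largest raw (not relative) summer-minus-winter gold difference.
    max_diff = (df['Gold'] - df['Gold.1']).max()
    # First country whose difference equals that maximum.
    return df[df['Gold'] - df['Gold.1'] == max_diff].index[0]

answer_two()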

Answer 7 (score: 0)

def answer_three():
    both = df[(df['Gold'] > 0) & (df['Gold.1'] > 0)]  # gold in both seasons
    # idxmax returns the index label; in current pandas, argmax returns a position.
    return ((both['Gold'] - both['Gold.1']) / both['Gold.2']).idxmax()
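
Since idxmax and argmax are easy to confuse, here is a small sketch of how they differ on a Series, assuming pandas 1.0 or later (older releases treated Series.argmax as a deprecated alias of idxmax). The values in `ratios` are rounded illustrations, not results from the full dataset.

```python
import pandas as pd

# Illustrative ratios only; the labels are countries from the sample data.
ratios = pd.Series([0.33, 0.93, 0.53], index=["Belarus", "Australia", "Austria"])

label = ratios.idxmax()     # index label of the maximum
position = ratios.argmax()  # integer position of the maximum

print(label)                   # Australia
print(position)                # 1
print(ratios.index[position])  # Australia
```

For this question the country name is wanted, so idxmax is the call that returns it directly.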