Question

I have a dataframe like this:

 match_id inn1  bat  bowl  runs1 inn2   runs2   is_score_chased
    1     1     KKR  RCB    222  2      82          1
    2     1     CSK  KXIP   240  2      207         1
    8     1     CSK  MI     208  2      202         1
    9     1     DC   RR     214  2      217         1
   33     1     KKR  DC     204  2      181         1

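Now I want to update the values in the is_score_chased column by comparing runs1 and runs2. If runs1 > runs2, the corresponding value in that row should be 'yes'; otherwise it should be 'no'. I tried the following code: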

for i in (high_scores1):
  if(high_scores1['runs1']>=high_scores1['runs2']):
      high_scores1['is_score_chased']='yes'
  else:
      high_scores1['is_score_chased']='no'

But it didn't work. How do I change the values in the column?

Answer 1

You can do this more easily with np.where.

import numpy as np

# Pick 'yes' where runs1 >= runs2 holds for the row, 'no' otherwise.
high_scores1['is_score_chased'] = np.where(high_scores1['runs1'] >= high_scores1['runs2'],
                                           'yes', 'no')

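In general, if you find yourself iterating explicitly to set a column, there is usually an abstraction such as apply or where that is both faster and more concise.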
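As a minimal, self-contained sketch of the np.where approach: the frame below is rebuilt by hand from the table in the question (the construction itself is an assumption for illustration; only the column names and values come from the question).

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question's table.
high_scores1 = pd.DataFrame({
    "match_id": [1, 2, 8, 9, 33],
    "inn1": [1, 1, 1, 1, 1],
    "bat": ["KKR", "CSK", "CSK", "DC", "KKR"],
    "bowl": ["RCB", "KXIP", "MI", "RR", "DC"],
    "runs1": [222, 240, 208, 214, 204],
    "inn2": [2, 2, 2, 2, 2],
    "runs2": [82, 207, 202, 217, 181],
    "is_score_chased": [1, 1, 1, 1, 1],
})

# The comparison produces one boolean per row; np.where maps each to 'yes' or 'no'.
high_scores1["is_score_chased"] = np.where(
    high_scores1["runs1"] >= high_scores1["runs2"], "yes", "no"
)

print(high_scores1[["match_id", "runs1", "runs2", "is_score_chased"]])
```

With this data, only match 9 (214 vs 217) ends up as 'no'.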

Answer 2

You need to index into each row as you iterate over the dataframe, so:

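# Cast to object first so string labels can be written into the integer column.
high_scores1['is_score_chased'] = high_scores1['is_score_chased'].astype(object)
for i in high_scores1.index:  # iterate over row labels, not column names
    if high_scores1.loc[i, 'runs1'] >= high_scores1.loc[i, 'runs2']:
        high_scores1.loc[i, 'is_score_chased'] = 'yes'
    else:
        high_scores1.loc[i, 'is_score_chased'] = 'no'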
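For context on why the loop in the question fails, here is a small sketch. The two-row frame `df` is made up for illustration and is not part of the original thread.

```python
import pandas as pd

df = pd.DataFrame({"runs1": [222, 214], "runs2": [82, 217]})

# Iterating over a DataFrame directly yields its column labels, not its rows.
labels = [label for label in df]

# Comparing two columns gives a boolean Series, which has no single truth value,
# so using it as an `if` condition raises ValueError.
try:
    if df["runs1"] >= df["runs2"]:
        pass
    raised = False
except ValueError:
    raised = True

print(labels, raised)  # ['runs1', 'runs2'] True
```

This is why the comparison has to be made either per row (by index label) or on whole columns at once.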

Answer 3

This is a good case for using apply.

With axis=1, apply passes each row to your function, so it can work with two columns at once.

You can adapt it to your problem like this:

def f(x):
    # x is one row; the columns are named runs1 and runs2 in the question
    return 'yes' if x['runs1'] > x['runs2'] else 'no'

df['is_score_chased'] = df.apply(f, axis=1)

However, I'd suggest filling your column with booleans, which makes it simpler:

def f(x):
    return x['runs1'] > x['runs2']

You can also use a lambda to do it in one line:

df['is_score_chased'] = df.apply(lambda x: x['runs1'] > x['runs2'], axis=1)
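When only a boolean column is needed, the row-wise apply can likely be replaced by a direct column comparison. The sketch below uses a hand-built `df` with the run totals from the question, and the extra `label` column is a hypothetical name introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"runs1": [222, 240, 208, 214, 204],
                   "runs2": [82, 207, 202, 217, 181]})

# Row-wise apply and direct column comparison give the same booleans;
# the direct comparison is vectorized and avoids a Python call per row.
via_apply = df.apply(lambda x: x["runs1"] > x["runs2"], axis=1)
via_compare = df["runs1"] > df["runs2"]

df["is_score_chased"] = via_compare

# If 'yes'/'no' labels are still wanted, the booleans can be mapped afterwards.
df["label"] = df["is_score_chased"].map({True: "yes", False: "no"})

print(df)
```

Keeping the boolean column and deriving labels only for display tends to make later filtering simpler, for example `df[df["is_score_chased"]]`.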

How to compare two columns of the same dataframe?
