Question

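I want to create a new column based on the values in consecutive rows. I have the following data frame, with two subjects in the subject column, and I want to compare each row of the trial column with the row before it within each subject.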

df <- data.frame(subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), 
             trial = c("switch", "noswitch", "switch", "switch", "noswitch", "switch", "switch", "noswitch", "noswitch", "noswitch"))

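I am trying to use an if statement to look at a row and then create a new column based on the 4 possible configurations of two consecutive rows: switch-switch, noswitch-noswitch, noswitch-switch, switch-noswitch. The new column would be filled with names for these four configurations: comp, incomp, noswitch_incomp, switch_comp. The loop should start over for each new subject, so the first row of each subject would be NA because there is no previous value. So far I have the following: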

for(i in seq.int(unique(df$subject))){ 
  df$results <- if(df$switch == "switch" & lag(df$switch, 1) == "switch"){
    "comp" 
    } else if (df$switch == "noswitch" & lag(df$switch, 1) == "noswitch"){
      "incomp"
    } else if (df$switch == "noswitch" & lag(df$switch, 1) == "switch"){
      "noswitch_incomp"
    } else {
      "switch_comp" 
    }
}

I get the following error, which I think has to do with the if statement not evaluating the arguments inside it:

   Error in if (df$switch == "switch" & lag(df$switch, 1) == "switch") { : 
  argument is of length zero

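I also tried using mutate() from dplyr to match the conditions, but a similar error occurs. Is there another function I could try for evaluating these conditions?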

Answer 1

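You can use dplyr::case_when. Unlike if, it is vectorized, so each condition is evaluated for every row at once. Grouping by subject first makes lag() start over at each subject, which leaves NA in the first row of every subject.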
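As a side note on the error itself, here is a minimal base R sketch of why the loop in the question fails. The small data frame `demo` and the variables `cond` and `msg` are made up for illustration. The data frame has a column named trial but none named switch, so `$switch` returns NULL, the comparison has length zero, and if cannot use it as a condition.

```r
# Hypothetical three-row data frame with the same column names as the question.
demo <- data.frame(subject = c(1, 1, 2),
                   trial = c("switch", "noswitch", "switch"))

# There is no column called `switch`, so `$` returns NULL.
is.null(demo$switch)

# Comparing NULL with a string gives a zero-length logical vector.
cond <- demo$switch == "switch"
length(cond)

# `if` needs exactly one TRUE or FALSE, so a zero-length condition is an error.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  if (cond) "comp" else "other",
  error = function(e) conditionMessage(e)
)
msg
```

Even with the column name corrected to trial, if would still receive one value per row rather than a single value, which is why the vectorized case_when call in the pipeline that follows is the better fit.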

library(dplyr)

# Group by subject so that lag() never looks across subjects
df %>% group_by(subject) %>%
  mutate(results = case_when(
    trial == 'switch' & lag(trial) == 'switch' ~ 'comp',
    trial == 'noswitch' & lag(trial) == 'noswitch' ~ 'incomp',
    trial == 'noswitch' & lag(trial) == 'switch' ~ 'noswitch_incomp',
    trial == 'switch' & lag(trial) == 'noswitch' ~ 'switch_comp'
  ))

# # A tibble: 10 x 3
# # Groups:   subject [2]
#    subject trial    results     
#      <dbl> <chr>    <chr>          
#  1      1. switch   NA             
#  2      1. noswitch noswitch_incomp
#  3      1. switch   switch_comp    
#  4      1. switch   comp           
#  5      1. noswitch noswitch_incomp
#  6      2. switch   NA             
#  7      2. switch   comp           
#  8      2. noswitch noswitch_incomp
#  9      2. noswitch incomp         
# 10      2. noswitch incomp
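
If you would rather not depend on dplyr, the same result can be sketched in base R. This is only one possible approach, not the only way to do it: ave() builds the previous trial within each subject, and a named vector maps each current-previous pair to its label. The names `prev`, `labels`, and `key` are made up for this example.

```r
# Same data as in the question.
df <- data.frame(
  subject = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  trial = c("switch", "noswitch", "switch", "switch", "noswitch",
            "switch", "switch", "noswitch", "noswitch", "noswitch"),
  stringsAsFactors = FALSE
)

# Previous trial within each subject; the first row of each subject gets NA.
df$prev <- ave(df$trial, df$subject,
               FUN = function(x) c(NA, head(x, -1)))

# Lookup table keyed by "current-previous" (hypothetical helper).
labels <- c(
  "switch-switch"     = "comp",
  "noswitch-noswitch" = "incomp",
  "noswitch-switch"   = "noswitch_incomp",
  "switch-noswitch"   = "switch_comp"
)

# Keys without a match, such as "switch-NA", index to NA.
key <- paste(df$trial, df$prev, sep = "-")
df$results <- unname(labels[key])

df
```

The lookup table keeps the four labels in one place, so adding or renaming a configuration only means editing `labels`.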
