Question

I have a large data frame; here is a simplified example:

df1<- data.frame(nest = c(1:12),
            plot = rep(c("a", "a", "a","b", "b", "b"), times = 2),
            year = rep(c(2015, 2016, 2017), times = 4),
            treatment = rep(c("Control", "Trap","Control","Trap","Control","Control"), times = 2))

which gives:

 nest plot year treatment
  1    a  2015   Control
  2    a  2016      Trap
  3    a  2017   Control
  4    b  2015      Trap
  5    b  2016   Control
  6    b  2017   Control
  7    a  2015   Control
  8    a  2016      Trap
  9    a  2017   Control
 10    b  2015      Trap
 11    b  2016   Control
 12    b  2017   Control

I want to create a new column, prevTrap, based on the following:

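Grouped by plot: prevTrap = 1 if the treatment in the previous year was "Trap", otherwise 0
If year = 2015, prevTrap = 0

(there can be multiple nests within the same plot/year combination)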

Desired result:

 nest plot year treatment  prevTrap
  1    a  2015   Control       0
  2    a  2016      Trap       0
  3    a  2017   Control       1
  4    b  2015      Trap       0
  5    b  2016   Control       1
  6    b  2017   Control       0
  7    a  2015   Control       0
  8    a  2016      Trap       0
  9    a  2017   Control       1
 10    b  2015      Trap       0
 11    b  2016   Control       1
 12    b  2017   Control       0

I tried different variations of the following code, all of which result in every prevTrap value being 0:

df2<- df1 %>%
group_by(plot) %>%
mutate(prevTrap = ifelse(treatment == "Trap" &
                        year == year - 1, 
                        "1", "0"))

Should I treat year as a factor or as a number?

Answer 1

I found a solution that is not affected by how the data frame is sorted:

library(dplyr)

# filter to get the plots that were Trap in 2015
Trap2015 <- filter(df1, year == 2015 & treatment == "Trap")
# unique() works whether plot is stored as a factor or as character
Trap2015plots <- unique(as.character(Trap2015$plot))
Trap2015plots

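The above obviously returns a single value, "b", but for a larger data set it produces a list that can be fed into the next part of the code. I did the same for 2016 (not shown).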

#create prevTrap column
df2<- df1 %>%
      mutate(prevTrap = ifelse(df1$plot %in% c("b") & #2015 plots = Trap
                         as.character(year) == "2016" |
                         df1$plot %in% c("a") & #2016 plots = Trap
                         as.character(year) == "2017",
                         "1", "0"))

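The same idea can probably be generalized so the plot lists do not have to be typed in by hand for each year. A minimal sketch, assuming every nest in a given plot/year shares one treatment (as in the example data); `trapped_prev` is a hypothetical helper name, not something from the original code. It collects every plot/year that was "Trap", shifts the year forward by one, and joins that back onto the nests:

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(
  nest = 1:12,
  plot = rep(c("a", "a", "a", "b", "b", "b"), times = 2),
  year = rep(c(2015, 2016, 2017), times = 4),
  treatment = rep(c("Control", "Trap", "Control", "Trap", "Control", "Control"), times = 2)
)

# One row per trapped plot/year, moved forward one year so that it lines up
# with the nests it should flag (hypothetical helper table)
trapped_prev <- df1 %>%
  filter(treatment == "Trap") %>%
  distinct(plot, year) %>%
  mutate(year = year + 1, prevTrap = 1)

# Nests without a match get NA from the join; turn those into 0
df2 <- df1 %>%
  left_join(trapped_prev, by = c("plot", "year")) %>%
  mutate(prevTrap = coalesce(prevTrap, 0))

df2
```

Because the match is done on plot and year values rather than on row position, the row order of df1 does not matter. If a plot/year could contain both "Trap" and "Control" nests, this sketch flags the following year whenever at least one nest was trapped, which may or may not be what you want.
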
Answer 2

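This works for your example data frame, but it will only work on your large data set if it is structured the same way, i.e. years are sorted within each group and the groups alternate with one another (abab ...).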

I also named the data frame df1, to avoid confusion with the df() function.

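library(tidyverse)
df1 %>%
  group_by(plot) %>% 
  # lag() looks at the previous row within each plot, so row order matters
  mutate(prevTrap = ifelse(lag(treatment) == "Trap", "1", "0")) %>%
  ungroup() %>% 
  # prevTrap is a character column, so the replacement must be "0", not 0
  replace_na(list(prevTrap = "0"))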
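
If the large data set is not ordered like the example, one way to keep the lag() idea is to collapse the data to one row per plot/year first, sort that, and join the result back. This is only a sketch under the assumption that each plot/year has a single treatment; `prev_by_plot_year` is a hypothetical name introduced here. It also checks that the lagged row really is the year before, so gaps in the years do not produce false positives:

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(
  nest = 1:12,
  plot = rep(c("a", "a", "a", "b", "b", "b"), times = 2),
  year = rep(c(2015, 2016, 2017), times = 4),
  treatment = rep(c("Control", "Trap", "Control", "Trap", "Control", "Control"), times = 2)
)

prev_by_plot_year <- df1 %>%
  distinct(plot, year, treatment) %>%  # one row per plot/year (assumed single treatment)
  arrange(plot, year) %>%              # sort explicitly instead of relying on input order
  group_by(plot) %>%
  mutate(prevTrap = as.integer(coalesce(
    lag(treatment) == "Trap" & year - lag(year) == 1,
    FALSE                              # first year of each plot has no previous year
  ))) %>%
  ungroup() %>%
  select(plot, year, prevTrap)

# Attach the plot/year flag to every nest
df3 <- df1 %>%
  left_join(prev_by_plot_year, by = c("plot", "year"))

df3
```

Note that if a plot/year ever had more than one treatment, distinct() would keep several rows for it and the join would duplicate nests, so that case would need its own rule.
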

New column from multiple groups and conditions
