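Creating a column based on multiple conditions in R

Time: 2018-08-10 04:05:45

Tags: r dataframe dplyr

I have a data frame with three columns of interest, individual ID, trip (ordered within ID), and forage (yes or no), plus a timestamp: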

example <- data.frame(IDs = c(rep("A",30),rep("B",30)), 
                  timestamp = seq(c(ISOdate(2016,10,01)), by = "day", length.out = 60),
                  trip = c(rep("1",15),rep("2",15)), 
                  forage = c(rep("Yes",3),rep("No",5),rep("Yes",3),rep("No",4),rep("Yes",7),rep("No",8)))

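I want to create two separate columns that number the foraging events for each observation. In the first column, I want to number each observation where forage == "Yes" within ID and trip (so each trip within an individual has x foraging events, and the numbering restarts at 1 for the next trip within that individual). That column would look like this: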

example$forageEvent1 <- c(rep(1,3),rep(NA,5),rep(2,3),rep(NA,4),rep(1,7),rep(NA,8),rep(1,3),rep(NA,5),rep(2,3),rep(NA,4),rep(1,7),rep(NA,8))

The second column would number the foraging events by ID only:

example$forageEvent2 <- c(rep(1,3),rep(NA,5),rep(2,3),rep(NA,4),rep(3,7),rep(NA,8),rep(1,3),rep(NA,5),rep(2,3),rep(NA,4),rep(3,7),rep(NA,8))

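I can subset/pipe the data by individual and then by trip, and I have tried ifelse(), but I can't figure out how to write code that creates the sequence of events. Thanks, everyone.

Edit: The code below (edited from a comment) is close. However, the labels start at "Forage0" instead of "Forage1".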

library(dplyr)
Test_example <- example %>%
  group_by(IDs) %>%
  mutate(
    ForagebyID = case_when(
      forage == "Yes" ~ "Forage",
      forage == "No" ~ "NonForage"),
    rleid = cumsum(ForagebyID != lag(ForagebyID, 1, default = "NA")),
    ForagebyID = case_when(
      ForagebyID == "Forage" ~ paste0(ForagebyID, rleid %/% 2),
      TRUE ~ "NonForage"),
    rleid = NULL
  )
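
The off-by-one likely comes from the integer division. Within each ID the run counter `rleid` starts at 1, so the foraging runs get the odd ids 1, 3, 5, and `rleid %/% 2` maps them to 0, 1, 2. A minimal base R sketch of that arithmetic, assuming (as in the example data) that every ID begins with a foraging run; the names `original` and `shifted` are only illustrative:

```r
# Run ids of the foraging runs within one ID when the ID starts with "Yes"
rleid <- c(1, 3, 5)

# Division used in the code above: first foraging run becomes 0
original <- rleid %/% 2

# Shifting by one before dividing makes the first foraging run 1
shifted <- (rleid + 1) %/% 2

print(original)
print(shifted)
```

If an ID could begin with a non-foraging run, the foraging runs would get even ids instead, and neither formula would be right for both cases.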

1 answer:

Answer 0 (score: 1)

I think this will do what you want:

library(dplyr)

example <- data.frame(IDs = c(rep("A",30),rep("B",30)), 
                      timestamp = seq(c(ISOdate(2016,10,01)), by = "day", length.out = 60),
                      trip = c(rep("1",15),rep("2",15)), 
                      forage = c(rep("Yes",3),rep("No",5),rep("Yes",3),rep("No",4),rep("Yes",7),rep("No",8)))

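Test_example <- example %>%
  arrange(IDs, timestamp) %>%
  # Number the forage runs within each ID and trip; non-forage rows get 0
  group_by(IDs, trip) %>%
  mutate(forageEvent1 = case_when(forage == "No" ~ 0,
                                  # lag() needs a character default to match the type of forage;
                                  # dividing the run id by 2 relies on each group starting with "Yes"
                                  TRUE ~ cumsum(forage != lag(forage, default = "")) %/% 2 + 1)) %>%
  # Same numbering, restarting only at each new ID
  group_by(IDs) %>%
  mutate(forageEvent2 = case_when(forage == "No" ~ 0,
                                  TRUE ~ cumsum(forage != lag(forage, default = "")) %/% 2 + 1)) %>%
  ungroup()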
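
The integer-division approach above gives the expected numbers only when each group begins with a "Yes" run, and it marks non-forage rows with 0 rather than NA. As a possible alternative, here is a sketch that counts the rows where a "Yes" run starts, so it does not depend on what a group begins with, and returns NA for non-forage rows as in the question's desired output. The helper `number_forage_runs` and the name `result` are hypothetical and not part of the original answer:

```r
library(dplyr)

# Hypothetical helper: numbers each run of "Yes" values in order of
# appearance and returns NA for every other row.
number_forage_runs <- function(forage) {
  is_forage <- forage == "Yes"
  # A run starts where the current row is "Yes" and the previous row is not
  run_start <- is_forage & !dplyr::lag(is_forage, default = FALSE)
  if_else(is_forage, cumsum(run_start), NA_integer_)
}

# Same example data as in the question
example <- data.frame(IDs = c(rep("A",30),rep("B",30)),
                      timestamp = seq(c(ISOdate(2016,10,01)), by = "day", length.out = 60),
                      trip = c(rep("1",15),rep("2",15)),
                      forage = c(rep("Yes",3),rep("No",5),rep("Yes",3),rep("No",4),rep("Yes",7),rep("No",8)))

result <- example %>%
  arrange(IDs, timestamp) %>%
  group_by(IDs, trip) %>%                       # restart at each trip within an ID
  mutate(forageEvent1 = number_forage_runs(forage)) %>%
  group_by(IDs) %>%                             # restart only at each new ID
  mutate(forageEvent2 = number_forage_runs(forage)) %>%
  ungroup()
```

Because the helper only looks at where "Yes" runs begin, a group that opens with "No" still has its first foraging event numbered 1.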