How can I apply a function only to the first row of each group in dplyr?

Time: 2016-07-13 13:59:09

Tags: r dplyr

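Data

I have 3 variables. Vehicle.ID2 is the unique ID of a vehicle pair, dV is the speed difference between the lead vehicle and the following vehicle, and dA is the acceleration difference, which stays constant for a certain period of time. My grouping variables are therefore Vehicle.ID2 and dA. Here are a few rows of the raw data for a single Vehicle.ID2: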

    veh <- structure(list(Vehicle.ID2 = c("907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904"
), dA = c(0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 
0.43024, 0.43024, 0.43024, 0.43024, 0.43024, -0.3162, -0.3162, 
-0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, 
-0.3162), dV = c(-0.0427200000000001, 0.11031, 0.22627, 0.30058, 
0.33838, 0.35264, 0.35803, 0.36481, 0.37677, 0.39292, 0.40961, 
0.42206, 0.42557, 0.416090000000001, 0.39003, 0.34668, 0.296580000000001, 
0.268000000000001, 0.29681, 0.399859999999999, 0.554639999999999
)), class = "data.frame", .Names = c("Vehicle.ID2", "dA", "dV"
), row.names = c(NA, -21L))

Goal

I want to create a new column, OC_DV. Initially, every value of OC_DV is "no". I can do that like this:

veh$OC_DV <- "no"  

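Next, I want to split the data by the variables Vehicle.ID2 and dA. Then, for each group, I want to check whether the sign of the first value of dV matches the sign of the last value of dV. Depending on whether the signs match, I want to modify only the FIRST value of OC_DV. Here is the code: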

OC_DV[1] <- ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                     ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                            ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))  

Problem

I tried using mutate and do, but both produce an error:

    veh <- veh %>% 
  group_by(Vehicle.ID2, dA) %>%
  mutate(OC_DV[1] = ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                           ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                                  ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))
  )
Error: unexpected '=' in:
"  group_by(Vehicle.ID2, dA) %>%
  mutate(OC_DV[1] ="



 veh <- veh %>% 
  group_by(Vehicle.ID2, dA) %>%
  do(OC_DV[1] = ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                           ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                                  ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))
  )
Error: unexpected '=' in:
"  group_by(Vehicle.ID2, dA) %>%
  do(OC_DV[1] ="

If I remove the [1], there is no error, but every value in the group is changed:

    veh %>% 
  group_by(Vehicle.ID2, dA) %>%
  mutate(OC_DV = ifelse(sign(head(dV,1))== sign(tail(dV,1)),  "no",
                                ifelse(sign(head(dV,1))==-1 & sign(tail(dV,1))==1, "OPDV",
                                       ifelse(sign(head(dV,1))==1 & sign(tail(dV,1))==-1,"CLDV","no")))
  )

How can I change only the first value?

Expected output:

    structure(list(Vehicle.ID2 = c("907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904", 
"907-904", "907-904", "907-904", "907-904", "907-904", "907-904"
), dA = c(0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 0.43024, 
0.43024, 0.43024, 0.43024, 0.43024, 0.43024, -0.3162, -0.3162, 
-0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, -0.3162, 
-0.3162), dV = c(-0.0427200000000001, 0.11031, 0.22627, 0.30058, 
0.33838, 0.35264, 0.35803, 0.36481, 0.37677, 0.39292, 0.40961, 
0.42206, 0.42557, 0.416090000000001, 0.39003, 0.34668, 0.296580000000001, 
0.268000000000001, 0.29681, 0.399859999999999, 0.554639999999999
), OC_DV = c("OPDV", "no", "no", "no", "no", "no", "no", "no", 
"no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", 
"no", "no")), class = "data.frame", .Names = c("Vehicle.ID2", 
"dA", "dV", "OC_DV"), row.names = c(NA, -21L))

2 answers:

Answer 0: (score: 2)

Another idea, using several mutate calls, where fun1 is a helper applied to dV (its definition is not shown here):

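library(dplyr)
veh %>% 
   group_by(Vehicle.ID2, dA) %>% 
   mutate(id = seq_along(dV)) %>%                # row index within each group
   mutate(OC_DV = fun1(dV)) %>%                  # label computed from the group's dV
   mutate(OC_DV = ifelse(id == 1, OC_DV, 'no'))  # keep the label on the first row only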
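
Since the definition of fun1 is not shown, here is a minimal sketch of what it might look like, assuming it simply wraps the nested ifelse from the question. Both this fun1 body and the toy data frame are hypothetical and made up for illustration; they are not taken from the answer itself.

```r
library(dplyr)

# Hypothetical helper (assumption): compares the sign of the first and
# last values of x and returns a single label for the whole group.
fun1 <- function(x) {
  first_sign <- sign(head(x, 1))
  last_sign  <- sign(tail(x, 1))
  ifelse(first_sign == -1 & last_sign == 1, "OPDV",
         ifelse(first_sign == 1 & last_sign == -1, "CLDV", "no"))
}

# Toy data (made up): two groups of three rows each
toy <- data.frame(
  Vehicle.ID2 = "907-904",
  dA = c(0.4, 0.4, 0.4, -0.3, -0.3, -0.3),
  dV = c(-0.04, 0.11, 0.22, 0.42, 0.39, 0.55)
)

res <- toy %>%
  group_by(Vehicle.ID2, dA) %>%
  # fun1(dV) is one value per group; ifelse() places it on row 1 only
  mutate(OC_DV = ifelse(row_number() == 1, fun1(dV), "no")) %>%
  ungroup()

print(res)
```

In this sketch the separate id column is replaced by row_number(), which gives the same within-group index without adding a helper column.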

Answer 1: (score: 1)

This works. First, a slightly cleaner version of the conditional function:

fun <- function(x) {
  switch(paste(sign(head(x,1)), sign(tail(x,1))),
         '-1 1' = 'OPDV',
         '1 -1' = 'CLDV',
         'no')
}

Then we apply that function only to the first row of each group.

veh %>% 
  group_by(Vehicle.ID2, dA) %>% 
  mutate(OC_DV = if_else(row_number() == 1, fun(dV), 'no'))
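
As a possible alternative that stays close to the OC_DV[1] idea in the question, base R's replace() can overwrite just the first element of the column inside a grouped mutate. This is only a sketch under stated assumptions: the toy data frame below is made up for illustration, and fun is repeated from the answer above so the block runs on its own.

```r
library(dplyr)

# Same conditional function as in the answer above
fun <- function(x) {
  switch(paste(sign(head(x, 1)), sign(tail(x, 1))),
         "-1 1" = "OPDV",
         "1 -1" = "CLDV",
         "no")
}

# Toy data (made up): the first group goes from negative to positive dV,
# the second group from positive to negative
toy <- data.frame(
  Vehicle.ID2 = "907-904",
  dA = c(0.4, 0.4, 0.4, -0.3, -0.3, -0.3),
  dV = c(-0.04, 0.11, 0.22, 0.42, 0.39, -0.05)
)
toy$OC_DV <- "no"

res <- toy %>%
  group_by(Vehicle.ID2, dA) %>%
  # replace(x, 1, value) returns x with only its first element changed
  mutate(OC_DV = replace(OC_DV, 1, fun(dV))) %>%
  ungroup()

print(res)
```

This variant relies on OC_DV already existing with its default "no" values, as set up in the question.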