I have the following data frame, which contains three groups with two values in each group:
df <- data.frame(var1 = c("A","A","B","B","C","C"),
var2 = c(0,1,1,1,2,1)
)
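I want to compare each row with the other row in its group and add a column containing a character label. For each row there are these possibilities (I think):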
df$var2[1] < df$var2[2] # if TRUE write "N"
df$var2[1] > df$var2[2] # if TRUE write "S"
df$var2[1] == df$var2[2] # if TRUE write "U"
df$var2[2] < df$var2[1] # if TRUE write "N"
df$var2[2] > df$var2[1] # if TRUE write "S"
df$var2[2] == df$var2[1] # if TRUE write "U"
I want to run this test for every group and add a column that marks the result:
df <- data.frame(var1 = c("A","A","B","B","C","C"),
var2 = c(0,1,1,1,2,1),
var3 = c("N","S","U","U","S","N")
)
I hope someone can help!
Answer 0 (score: 1)
Answer based on the adjusted rules:
With dplyr:
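library(dplyr)
df %>%
  group_by(var1) %>%
  # lead() is NA on a group's last row and lag() is NA on its first row.
  # NA | TRUE is TRUE, and case_when() treats an NA condition as not matched.
  mutate(var3 = case_when(
    var2 < lead(var2) | var2 < lag(var2) ~ "N",
    var2 > lead(var2) | var2 > lag(var2) ~ "S",
    var2 == lead(var2) | var2 == lag(var2) ~ "U"
  ))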
# A tibble: 6 x 3
# Groups: var1 [3]
var1 var2 var3
<fct> <dbl> <chr>
1 A 0 N
2 A 1 S
3 B 1 U
4 B 1 U
5 C 2 S
6 C 1 N
With data.table:
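library(data.table)
dt <- setDT(df)  # converts df to a data.table by reference
# shift() pads the missing neighbour with fill = 0, so this version is only
# reliable when var2 is non-negative, as it is in the example data.
dt[, var3 := ifelse(var2 < shift(var2, n=1L, fill=0, type="lead") | var2 < shift(var2, n=1L, fill=0, type="lag"),
                    "N",
                    ifelse(var2 == shift(var2, n=1L, fill=0, type="lead") | var2 == shift(var2, n=1L, fill=0, type="lag"),
                           "U",
                           "S" )),
   by = var1]
dt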
dt
var1 var2 var3
1: A 0 N
2: A 1 S
3: B 1 U
4: B 1 U
5: C 2 S
6: C 1 N
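For comparison, here is a minimal base R sketch of the same labelling that needs neither lead/lag nor a padding value. It is not part of the original answer. It assumes that every group has exactly two rows, so that reversing the values within a group gives each row its partner's value. The helper name `label_pairs` is hypothetical and introduced only for illustration.

```r
# Same example data as in the question.
df <- data.frame(var1 = c("A", "A", "B", "B", "C", "C"),
                 var2 = c(0, 1, 1, 1, 2, 1))

# Hypothetical helper (not from the original answer).
# Assumption: each group has exactly two rows.
label_pairs <- function(value, group) {
  # rev() within each group hands every row the value of its partner row.
  other <- ave(value, group, FUN = rev)
  # sign() returns -1, 0 or 1; adding 2 turns that into an index 1, 2 or 3.
  c("N", "U", "S")[sign(value - other) + 2]
}

df$var3 <- label_pairs(df$var2, df$var1)
df
```

On the example data this reproduces the var3 column requested in the question. Because each row is compared only with its real partner, the result does not depend on the values being non-negative, which the `fill=0` padding in the data.table version does rely on.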