Question

I have this data frame:

id <- c(1, 1, 2, 2, 3, 3)
x <- c(0, 0, 0, 0, 0, 0)
y <- c(NA, 5, 5, 5, NA, 5)
t <- c(1, 2, 1, 2, 1, 2)

df <- data.frame(id, t, x, y)
df

  id t x  y
1  1 1 0 NA
2  1 2 0  5
3  2 1 0  5
4  2 2 0  5
5  3 1 0 NA
6  3 2 0  5

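id and t describe three cases observed at two time points. x and y are some arbitrary values. Now I want to add 9 to x in the rows where t = 2, but only if y at t = 1 is NA for the same id.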

The output should look like this:

> df
  id t x  y
1  1 1 0 NA
2  1 2 9  5
3  2 1 0  5
4  2 2 0  5
5  3 1 0 NA
6  3 2 9  5

I would really appreciate your help. A solution using ifelse would also be welcome.

Answer 1

I assume you want to do this by group.

Within each id, add 9 to x where t == 2 if y at t == 1 is NA.

library(dplyr)
df %>%
   group_by(id) %>%
   mutate(x = ifelse(is.na(y[t==1]) & t == 2, x + 9, x))

#    id     t     x     y
#  <dbl> <dbl> <dbl> <dbl>
#1    1.    1.    0.   NA 
#2    1.    2.    9.    5.
#3    2.    1.    0.    5.
#4    2.    2.    0.    5.
#5    3.    1.    0.   NA 
#6    3.    2.    9.    5.

Answer 2

You can create idvar, the ids that satisfy the condition, and then assign the value to the rows of those ids.

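# ids whose y is NA at t == 1
idvar = df$id[df$t == 1 & is.na(df$y)]
# rows of those ids at t == 2; add 9 rather than overwrite with 9
sel = df$id %in% idvar & df$t == 2
df$x[sel] = df$x[sel] + 9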
df
  id t x  y
1  1 1 0 NA
2  1 2 9  5
3  2 1 0  5
4  2 2 0  5
5  3 1 0 NA
6  3 2 9  5
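
Since the question asks for ifelse, the same idea can be sketched as a single vectorized ifelse call in base R. This is only an illustration: the name na_ids is made up here, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question
df <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  t  = c(1, 2, 1, 2, 1, 2),
  x  = c(0, 0, 0, 0, 0, 0),
  y  = c(NA, 5, 5, 5, NA, 5)
)

# ids whose y is NA at t == 1 (na_ids is a hypothetical name)
na_ids <- df$id[df$t == 1 & is.na(df$y)]

# add 9 to x only in the t == 2 rows of those ids
df$x <- ifelse(df$id %in% na_ids & df$t == 2, df$x + 9, df$x)
df
```

This version does not depend on the row order, because the match is done on id rather than on the previous row.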

Answer 3

Assuming there are always two time points and the rows are sorted as in the example, here is a version with ifelse:

# && short-circuits, so df$y[z-1] is never evaluated for the first row (t == 1)
df$x <- sapply(1:nrow(df), function(z) ifelse(df$t[z] == 2 && is.na(df$y[z-1]),
   df$x[z] + 9, df$x[z]))
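
Under the same assumptions (exactly two time points per id, rows sorted by id and then t), the row-by-row loop could probably be replaced by one vectorized ifelse on a lagged copy of y. A minimal sketch, where prev_y is a name introduced just for this example:

```r
# Rebuild the example data from the question
df <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  t  = c(1, 2, 1, 2, 1, 2),
  x  = c(0, 0, 0, 0, 0, 0),
  y  = c(NA, 5, 5, 5, NA, 5)
)

# y of the previous row; the first row has no predecessor, so pad with NA
# (assumes the first row has t == 1, as in the sorted example)
prev_y <- c(NA, head(df$y, -1))

# add 9 where this row is t == 2 and the previous row's y is NA
df$x <- ifelse(df$t == 2 & is.na(prev_y), df$x + 9, df$x)
df
```

If the rows are not sorted, or an id can have a different number of time points, this shortcut no longer holds and a grouped approach is the safer choice.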

Answer 4

Here is an option using data.table:

library(data.table)
setDT(df)[shift(t == 1 & is.na(y)) & t == 2, x := x + 9, id]
df
#   id t x  y
#1:  1 1 0 NA
#2:  1 2 9  5
#3:  2 1 0  5
#4:  2 2 0  5
#5:  3 1 0 NA
#6:  3 2 9  5

Editing a data frame with multiple time points using ifelse
