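I want to fill the NAs in the Group variable of my dataset using the group value from the same ID's previous years.
The na.locf(newData, na.rm = TRUE) part of my code has no effect. I think this is because the input is not numeric. Or is it something else?
Does anyone know how to fix this?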
for (i in my_data$ID){
newData = my_data[my_data$ID==i,c('ID','Year', 'group')][3]
na.locf(newData,na.rm = TRUE)
}
My dataset is large, but here is a small example of what I need:
structure(list(ID = c(1L, 2L, 3L, 1L, 1L, 1L), Year = c(2000L,
2000L, 2001L, 2001L, 2002L, 2003L), Group = structure(c(2L, 3L,
2L, 1L, 1L, 4L), .Label = c("", "\"A\"", "\"B\"", "\"C\""), class = "factor")), row.names = c(NA,
6L), class = "data.frame")
The result should look like this:
structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L), Year = c(2000L,
2001L, 2002L, 2003L, 2000L, 2002L), Group = structure(c(1L, 1L,
1L, 3L, 2L, 2L), .Label = c("\"A\"", "\"B\"", "\"C\""), class = "factor")), row.names = c(NA,
6L), class = "data.frame")
Answer 0: (score: 4)
So, as I mentioned, your problem is simply that you have to replace the empty strings with NA.
with(replace(df, df == '', NA), ave(Group, ID, FUN = zoo::na.locf))
#[1] "A" "B" "A" "A" "A" "C"
Assigning it back to df,
df$Group <- with(replace(df, df == '', NA), ave(Group, ID, FUN = zoo::na.locf))
gives,
  ID Year Group
1  1 2000   "A"
2  2 2000   "B"
3  3 2001   "A"
4  1 2001   "A"
5  1 2002   "A"
6  1 2003   "C"
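To see why the empty values matter, here is a minimal sketch on a hypothetical toy vector (x and lead are illustrative, not taken from the question's data), assuming the zoo package is installed. An empty string is an ordinary value rather than NA, so na.locf has nothing to fill until it is recoded. The last lines show that na.locf drops leading NAs by default, which would change the length of the result inside ave if an ID started with a missing Group; passing na.rm = FALSE is one possible way to keep the length unchanged.

```r
library(zoo)  # assumption: the zoo package is installed

# Hypothetical values for a single ID (not the question's data)
x <- c("A", "", "", "C")

# "" is a regular string, not a missing value, so nothing would be filled
anyNA(x)
# FALSE

# Recode the empty strings to NA, then carry the last observation forward
x[x == ""] <- NA
filled <- na.locf(x)
filled
# "A" "A" "A" "C"

# Leading NAs are dropped by default, so the result is shorter than the input
lead <- c(NA, "A", NA)
length(na.locf(lead))
# 2

# na.rm = FALSE keeps the leading NA and preserves the length
kept <- na.locf(lead, na.rm = FALSE)
kept
# NA "A" "A"
```

If some IDs in the real data begin with an empty Group, FUN = function(x) zoo::na.locf(x, na.rm = FALSE) may be a safer argument to ave than the default.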
Answer 1: (score: 0)
Base R, reusing the with / replace / ave logic from @Sotos's answer:
df$Group <- with(replace(df, df == '', NA),
ave(Group, ID, FUN = function(x){na.omit(x)[cumsum(!is.na(x))]}))
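The indexing step inside FUN can be traced by hand. The sketch below uses a hypothetical vector for a single ID (x, idx and filled are illustrative names, not part of the answer): cumsum(!is.na(x)) counts how many non-missing values have been seen so far, and that count is used as a position into the non-missing values.

```r
# Hypothetical Group values for one ID, with "" already recoded to NA
x <- c("A", NA, NA, "C")

# Running count of non-missing values seen so far
idx <- cumsum(!is.na(x))
idx
# 1 1 1 2

# The non-missing values are "A" and "C"; each position picks the latest one
filled <- as.character(na.omit(x)[idx])
filled
# "A" "A" "A" "C"

# Caveat: a leading NA gives index 0, which R silently drops when subsetting
short <- as.character(na.omit(c(NA, "A"))[cumsum(!is.na(c(NA, "A")))])
length(short)
# 1
```

As the last lines suggest, this approach assumes every ID starts with a known Group; otherwise the result is shorter than the input and ave will not line up cleanly.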
Data:
df <- structure(
list(
ID = c(1L, 2L, 3L, 1L, 1L, 1L),
Year = c(2000L,
2000L, 2001L, 2001L, 2002L, 2003L),
Group = structure(
c(2L, 3L,
2L, 1L, 1L, 4L),
.Label = c("", "\"A\"", "\"B\"", "\"C\""),
class = "factor"
)
),
row.names = c(NA,
6L),
class = "data.frame"
)