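I want to replace the NAs in v1 through v4 with the median of those same columns, computed separately for each row.
Here is some sample data: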
id <- c(1,2,3,4)
v1 <- c(1,3,0,2)
v2 <- c(NA,1,NA,2)
v3 <- c(2,4,1,2)
v4 <- c(NA,1,0,2)
v5 <- c(5,1,NA,2)
v6 <- c(7,1,9,NA)
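df <- data.frame(id, v1, v2, v3, v4, v5, v6)
library(dplyr)  # provides %>%, group_by() and mutate()
# id is unique per row, so grouping by id yields a row-wise median of v1..v4
df_pre <- df %>%
  group_by(id) %>%
  mutate(Median_v1_v4 = median(c(v1, v2, v3, v4), na.rm = TRUE))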
The data now looks like this:
id v1 v2 v3 v4 v5 v6 Median_v1_v4
 1  1 NA  2 NA  5  7          1.5
 2  3  1  4  1  1  1          2.0
 3  0 NA  1  0 NA  9          0.0
 4  2  2  2  2  2 NA          2.0
This is what I would like the data to look like:
id v1  v2 v3  v4 v5 v6 Median_v1_v4
 1  1 1.5  2 1.5  5  7          1.5
 2  3 1.0  4 1.0  1  1          2.0
 3  0 0.0  1 0.0 NA  9          0.0
 4  2 2.0  2 2.0  2 NA          2.0
Answer 0 (score: 0)
How about this solution?
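# Columns 2:5 are v1..v4; apply() works row by row and returns the rows as
# columns, so t() transposes the result back before assigning it.
df[, 2:5] <- t(apply(df[, 2:5], 1, function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  return(x)
}))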
df
  id v1  v2 v3  v4 v5 v6
1  1  1 1.5  2 1.5  5  7
2  2  3 1.0  4 1.0  1  1
3  3  0 0.0  1 0.0 NA  9
4  4  2 2.0  2 2.0  2 NA
Adapted from: Replace NA values by row means
PS: I saw the comment (@Sai Saran) too late; this is an adaptation of the solution in the link above.
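The same row-wise fill can also be sketched in base R without relying on column positions. This is only an illustrative variant, not the answerer's code: the names `cols`, `m`, `row_med` and `na_pos` are made up for the example, and it assumes the target columns are all numeric. A row whose target columns are all NA would simply stay NA.

```r
# Sample data from the question
df <- data.frame(
  id = c(1, 2, 3, 4),
  v1 = c(1, 3, 0, 2),
  v2 = c(NA, 1, NA, 2),
  v3 = c(2, 4, 1, 2),
  v4 = c(NA, 1, 0, 2),
  v5 = c(5, 1, NA, 2),
  v6 = c(7, 1, 9, NA)
)

# Columns to fill, selected by name (hypothetical helper variable)
cols <- c("v1", "v2", "v3", "v4")
m <- as.matrix(df[cols])

# Row-wise median over the chosen columns, ignoring NAs
row_med <- apply(m, 1, median, na.rm = TRUE)

# Locate every NA as a (row, col) pair and fill it with the median of its row
na_pos <- which(is.na(m), arr.ind = TRUE)
m[na_pos] <- row_med[na_pos[, "row"]]

# Write the filled values back; v5 and v6 are left untouched
df[cols] <- m
df
```

Selecting by name keeps the code working if columns are reordered, and the matrix indexing avoids the transpose step.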
Answer 1 (score: 0)
You can try
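library(tidyverse)
df %>%
  # long format: one row per (id, column name k, value v)
  gather(k, v, -id) %>%
  group_by(id) %>%
  # per-id median over v1..v4 only
  mutate(Median = median(v[k %in% c("v1", "v2", "v3", "v4")], na.rm = TRUE)) %>%
  # fill NAs in v1..v4 with that median; v5 and v6 are left alone
  mutate(v = ifelse(is.na(v) & k %in% c("v1", "v2", "v3", "v4"), Median, v)) %>%
  # back to wide format
  spread(k, v)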
# A tibble: 4 x 8
# Groups:   id [4]
     id Median    v1    v2    v3    v4    v5    v6
  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1    1.5     1   1.5     2   1.5     5     7
2     2    2       3   1       4   1       1     1
3     3    0       0   0       1   0      NA     9
4     4    2       2   2       2   2       2    NA
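Since tidyr marks `gather()` and `spread()` as superseded, the same reshape-fill-reshape idea can be sketched with `pivot_longer()` and `pivot_wider()`. This is a hedged rewrite rather than the answerer's code: it assumes tidyr 1.0.0 or later, and `target` and `res` are names introduced only for this example.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(
  id = c(1, 2, 3, 4),
  v1 = c(1, 3, 0, 2),
  v2 = c(NA, 1, NA, 2),
  v3 = c(2, 4, 1, 2),
  v4 = c(NA, 1, 0, 2),
  v5 = c(5, 1, NA, 2),
  v6 = c(7, 1, 9, NA)
)

# Columns whose NAs should be filled (hypothetical helper variable)
target <- c("v1", "v2", "v3", "v4")

res <- df %>%
  # long format: one row per (id, column name k, value v)
  pivot_longer(-id, names_to = "k", values_to = "v") %>%
  group_by(id) %>%
  mutate(
    # per-id median over the target columns only
    Median = median(v[k %in% target], na.rm = TRUE),
    # fill NAs in the target columns; other columns keep their NAs
    v = if_else(is.na(v) & k %in% target, Median, v)
  ) %>%
  ungroup() %>%
  # back to wide format: id, Median, v1..v6
  pivot_wider(names_from = k, values_from = v)

res
```

`if_else()` is stricter about types than `ifelse()`; it works here because both `Median` and `v` are doubles.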
Answer 2 (score: 0)
Take a look at this code.
mydatamatrix