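I have a dataset with two observations per person and more than 100 variables. I want to fill in each person's missing values using the available value of the same variable. I can do this manually with dplyr's mutate function, but doing so for every variable that needs filling would be very tedious.
Here is what I tried, but it failed: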
> # Here's data example
> # https://www.dropbox.com/s/a0bc69xgxhaeguc/data_xlsc.xlsx?dl=0
> # I have already attached it to my working space
>
> names(data)
[1] "ID" "Age" "var1" "var2" "var3" "var4" "var5" "var6" "var7" "var8" "var9"
> head(data)
Source: local data frame [6 x 11]
ID Age var1 var2 var3 var4 var5 var6 var7 var8 var9
1 1 50 27.5 1.83 92.0 NA NA NA NA NA 5.1
2 1 NA NA NA NA 3.54 30.2 27.9 64.34 60.8 NA
3 2 51 33.7 1.77 105.6 NA NA NA NA NA 5.2
4 2 NA NA NA NA 4.05 36.4 38.7 67.75 63.7 NA
5 3 43 26.3 1.84 89.1 NA NA NA NA NA 4.8
6 3 NA NA NA NA 3.77 24.4 21.9 67.97 64.2 NA
> # As you can see above, for each person (ID) there are missing values for age and other variables.
> # I'd like to fill in missing data with the available data for each variable, for each ID
>
> #These are the variables that I need to fill in
> desired_variables <- names(data[,2:11])
>
> # this is my attempt that failed
>
> data2 <- data %>% group_by(ID) %>%
+ do(
+ for (i in seq_along(desired_variables)) {
+ i=max(i, na.rm=T)
+ }
+ )
Error: Results are not data frames at positions: 1, 2, 3
Ideal output for the first person:
ID Age var1 var2 var3 var4 var5 var6 var7 var8 var9
1 1 50 27.5 1.83 92.0 3.54 30.2 27.9 64.34 60.8 5.1
2 1 50 27.5 1.83 92.0 3.54 30.2 27.9 64.34 60.8 5.1
Answer 0 (score: 5)
Here is a possible data.table solution:
library(data.table)
setattr(data, "class", "data.frame") ## Only needed if your data is of `tbl_df` class
## Replace every desired column with its per-ID maximum, ignoring NAs
## (you can also use `.SDcols` to restrict this to specific columns)
setDT(data)[, (desired_variables) := lapply(.SD, max, na.rm = TRUE), by = ID]
data
# ID Age var1 var2 var3 var4 var5 var6 var7 var8 var9
# 1: 1 50 27.5 1.83 92.0 3.54 30.2 27.9 64.34 60.8 5.1
# 2: 1 50 27.5 1.83 92.0 3.54 30.2 27.9 64.34 60.8 5.1
# 3: 2 51 33.7 1.77 105.6 4.05 36.4 38.7 67.75 63.7 5.2
# 4: 2 51 33.7 1.77 105.6 4.05 36.4 38.7 67.75 63.7 5.2
# 5: 3 43 26.3 1.84 89.1 3.77 24.4 21.9 67.97 64.2 4.8
# 6: 3 43 26.3 1.84 89.1 3.77 24.4 21.9 67.97 64.2 4.8
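Since the question started from dplyr, the same per-ID fill can probably be written there as well. The following is only a sketch, assuming dplyr 1.0.0 or later (for `across()`). The toy data copies three columns for the first two IDs from the question, and `first_non_na` is a hypothetical helper, not part of any package. It takes the first non-missing value rather than the maximum, so it gives the same result as `max` only when each ID has a single non-missing value per column, as in the example data.

```r
library(dplyr)

# Toy data mirroring IDs 1 and 2 from the question (three columns only)
data <- tibble(
  ID   = c(1, 1, 2, 2),
  Age  = c(50, NA, 51, NA),
  var4 = c(NA, 3.54, NA, 4.05)
)

# Hypothetical helper: first non-missing value in x,
# or NA if every value in the group is missing
first_non_na <- function(x) {
  x[!is.na(x)][1]
}

filled <- data %>%
  group_by(ID) %>%
  # everything() skips the grouping column, so only Age and var4 are filled
  mutate(across(everything(), first_non_na)) %>%
  ungroup()

filled
```

Unlike `max(..., na.rm = TRUE)`, this helper leaves an `NA` in place (instead of returning `-Inf` with a warning) when a person has no value at all for a variable.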