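My example dataset has 10 rows instead of 5 because of NAs that follow an obvious pattern: for each variable, one row holds only mean_stat and the next holds only sd_stat.
What is an efficient way to collapse a dataset like this from 10 rows to 5?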
df <- structure(list(variable = c("am", "am", "carb", "carb", "cyl",
"cyl", "disp", "disp", "drat", "drat"), mean_stat = c(0.40625,
NA, 2.8125, NA, 6.1875, NA, 230.721875, NA, 3.5965625, NA), sd_stat = c(NA,
0.498990917235846, NA, 1.61519997763185, NA, 1.78592164694654,
NA, 123.938693831382, NA, 0.534678736070971)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L), .Names = c("variable",
"mean_stat", "sd_stat"))
Answer 0: (score: 3)
Assuming you want to take the first non-NA value of each column for each variable:
library(dplyr)
df %>% group_by(variable) %>% summarise_all(~ first(na.omit(.)))
# A tibble: 5 x 3
# variable mean_stat sd_stat
# <chr> <dbl> <dbl>
#1 am 0.406250 0.4989909
#2 carb 2.812500 1.6152000
#3 cyl 6.187500 1.7859216
#4 disp 230.721875 123.9386938
#5 drat 3.596563 0.5346787
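In dplyr 1.0.0 and later, `summarise_all()` is superseded by `across()`. The following is a minimal sketch of the same idea in that style, assuming dplyr >= 1.0.0 is installed. The data is a shortened copy of the question's `df`, and `collapsed` is only an illustrative name.

```r
library(dplyr)

# Shortened copy of the question's data: for each variable,
# one row holds the mean and the next row holds the sd.
df <- data.frame(
  variable  = c("am", "am", "carb", "carb"),
  mean_stat = c(0.40625, NA, 2.8125, NA),
  sd_stat   = c(NA, 0.498990917235846, NA, 1.61519997763185)
)

# Per variable, keep the first non-NA value of every other column.
# Indexing with [1] yields NA if a column has no non-NA value in the group.
collapsed <- df %>%
  group_by(variable) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]))

collapsed
```

Inside a grouped `summarise()`, `across(everything())` skips the grouping column, so only `mean_stat` and `sd_stat` are summarised.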
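Answer 1: (score: 2)
You should almost certainly fix this upstream. I have seen output like this come from a botched transpose operation.
That said, you can drop the NAs with the tidyr package by reshaping to long format, removing the NA rows, and reshaping back to wide: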
library(tidyr)
df %>%
  gather(key = "var", value = "val", mean_stat, sd_stat) %>%
  drop_na() %>%
  spread(var, val)
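`gather()` and `spread()` are superseded in tidyr 1.0.0 and later by `pivot_longer()` and `pivot_wider()`. A hedged sketch of the same long-then-wide round trip, assuming tidyr >= 1.0.0 is installed; the data is again a shortened copy of the question's `df`, and `collapsed` is an illustrative name.

```r
library(tidyr)

# Shortened copy of the question's data.
df <- data.frame(
  variable  = c("am", "am", "carb", "carb"),
  mean_stat = c(0.40625, NA, 2.8125, NA),
  sd_stat   = c(NA, 0.498990917235846, NA, 1.61519997763185)
)

collapsed <- df %>%
  # Go long and drop the NA cells in the same step.
  pivot_longer(c(mean_stat, sd_stat),
               names_to = "var", values_to = "val",
               values_drop_na = TRUE) %>%
  # Go wide again: one row per variable, one column per statistic.
  pivot_wider(names_from = var, values_from = val)

collapsed
```

This relies on each variable having exactly one non-NA value per statistic. If a variable had several, `pivot_wider()` would warn and return list-columns rather than a single value per cell.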