Select rows with complete.cases across multiple columns in R

Time: 2018-07-17 15:48:54

Tags: r

Let's start with some data to make the example reproducible:

x <- structure(list(DC1 = c(5, 5, NA, 5, 4, 6, 5, NA, 4, 6, 6, 6, 
5, NA, 5, 5, 7), DC2 = c(4, 7, 4, 5, NA, 4, 6, 4, 4, 5, 5, 5, 
5, NA, 6, 5, 5), DC3 = c(4, 7, 4, 4, NA, 4, 5, 4, 5, 4, 5, 5, 
6, 4, 6, 6, 5), DC4 = c(4, 7, 5, NA, NA, 4, 6, 5, 5, 4, 3, 4, 
6, 5, 5, 6, 3), DC5 = c(7, 8, 5, NA, NA, 10, 7, 6, 8, 6, 6, 7, 
11, 10, 5, 7, 6), DC6 = c(8, 8, NA, NA, NA, 11, 9, 8, 9, 9, 10, 
10, 12, 16, 6, 8, 9), DC7 = c(10, 10, 10, NA, NA, 8, 9, 8, 13, 
8, 11, 9, 14, 13, 8, 8, 11), DC8 = c(17, 10, 10, NA, NA, 10, 
10, 10, 15, 10, 14, 11, 23, 15, 14, 13, 14), DC9 = c(16, 9, 9, 
NA, NA, 12, 13, 11, 13, 15, 15, 13, 17, 15, 25, 17, 12)), .Names = c("DC1", 
"DC2", "DC3", "DC4", "DC5", "DC6", "DC7", "DC8", "DC9"), class = "data.frame", row.names = c(NA, 
-17L))

How can I filter the data frame so that only rows with no missing values in columns DC3 through DC9 are kept?

3 answers:

Answer 0: (score: 3)

Here is a dplyr option:

library(dplyr)
x %>% 
  filter_at(vars(DC3:DC9), all_vars(!is.na(.)))

Or:

x %>% 
  filter_at(vars(DC3:DC9), all_vars(complete.cases(.)))

Here is a tidyr option:

x %>% tidyr::drop_na(DC3:DC9)
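In recent dplyr releases the scoped verbs such as filter_at() are superseded, and if_all() (available from dplyr 1.0.4) covers the same case. The following is a minimal sketch of that approach on a small made-up data frame; the names df and res and the values are assumptions for illustration, not the question's data.

```r
library(dplyr)

# Small hypothetical data frame (not the data from the question)
df <- data.frame(
  DC1 = c(1, NA, 3, 4, 5),
  DC2 = c(NA, 2, 3, 4, 5),
  DC3 = c(1, 2, NA, 4, 5),
  DC4 = c(1, 2, 3, NA, 5)
)

# Keep rows where every column from DC2 to DC4 is non-missing.
# DC1 is outside the range, so an NA there does not drop the row.
res <- df %>%
  filter(if_all(DC2:DC4, ~ !is.na(.x)))

res
```

Applied to the question's data, the same pattern would presumably read x %>% filter(if_all(DC3:DC9, ~ !is.na(.x))).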

Answer 1: (score: 2)

We can subset the data to the columns of interest and apply complete.cases to that subset:

x[complete.cases(x[3:9]),]

Or use the column names:

x[complete.cases(x[paste0("DC", 3:9)]),]
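To see why this works, it may help to look at what complete.cases returns: one logical value per row, TRUE only when none of the supplied columns is NA in that row. Here is a small sketch on a made-up data frame (df, keep, and the column names p, q, r are illustrative assumptions):

```r
# Small hypothetical data frame (not the data from the question)
df <- data.frame(
  p = c(1, NA, 3),
  q = c(1, 2, NA),
  r = c(NA, 2, 3)
)

# Check only columns q and r; column p is ignored
keep <- complete.cases(df[c("q", "r")])
keep

# Use the logical vector as a row index; the trailing comma
# means "all columns", so the result still contains p
df[keep, ]
```

Because the check runs on the column subset while the row index is applied to the full data frame, columns outside the subset are kept in the result even if they contain NA.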

Answer 2: (score: 2)

You can use the function str_extract from the package stringr, which can extract the numbers from the column names of the data frame.

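# get the number from each column name (\\d+ also matches multi-digit numbers)
col_num <- as.numeric(stringr::str_extract(names(x), "\\d+"))

# keep rows with no NA in the columns numbered 3 through 9 (DC3 to DC9)
cols <- (col_num >= 3) & (col_num < 10)
x[complete.cases(x[cols]), ]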

Edit note: to install stringr, use install.packages("stringr")
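If installing stringr is not an option, the numeric suffix can also be pulled out with base R. This is only a sketch on hypothetical column names (nms, col_num, and in_range are names made up for illustration); it strips every non-digit character, so it assumes each name contains a single run of digits.

```r
# Hypothetical column names, including one with a two-digit suffix
nms <- c("DC1", "DC2", "DC3", "DC10")

# Remove every non-digit character, then convert the rest to numbers
col_num <- as.numeric(gsub("\\D", "", nms))
col_num

# Logical index of the columns whose number lies between 3 and 10
in_range <- col_num >= 3 & col_num <= 10
nms[in_range]
```

The resulting logical vector can be used to pick the columns passed to complete.cases, in the same way as the column subsets in the other answers.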