Select observations with NA in a data frame in R

Date: 2018-07-13 12:17:35

Tags: r dplyr

Say this is my data:

data=structure(list(x1 = structure(c(1L, 7L, 2L, 8L, 4L, 5L, 11L, 
9L, 3L, 6L, 10L), .Label = c("1270", "14130", "2030", "29910", 
"310", "3160", "570", "620", "7520", "960", "na"), class = "factor"), 
    x2 = structure(c(6L, 2L, 7L, 6L, 4L, 3L, 4L, 1L, 5L, 6L, 
    2L), .Label = c("10", "11", "12", "4", "8", "9", "na"), class = "factor"), 
    x3 = structure(c(4L, 3L, 2L, 5L, 9L, 7L, 7L, 8L, 1L, 5L, 
    6L), .Label = c("2000", "2006", "2007", "2008", "2009", "2011", 
    "2013", "2014", "na"), class = "factor"), Date = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "26.11.2014", class = "factor"), 
    Sales = c(5577L, 5919L, 6911L, 13307L, 5640L, 6555L, 11430L, 
    6401L, 8072L, 6350L, 10031L), id = 1:11), .Names = c("x1", 
"x2", "x3", "Date", "Sales", "id"), class = "data.frame", row.names = c(NA, 
-11L))

I need a data frame with the id and Date of every observation that has at least one NA (missing value). For this example, the output would be:

Date       id
26.11.2014  3
26.11.2014  5
26.11.2014  7

How can I do this?

1 answer:

Answer 0: (score: 3)

You don't have any missing values, only "na" strings, so we first convert those to real NA values:

data[data=="na"] <- NA
data[!complete.cases(data),]
#      x1   x2   x3       Date Sales id
# 3 14130 <NA> 2006 26.11.2014  6911  3
# 5 29910    4 <NA> 26.11.2014  5640  5
# 7  <NA>    4 2013 26.11.2014 11430  7
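
The result above keeps every column, while the question asks only for Date and id. A minimal sketch of that last step, assuming the columns are named as in the question; the small data frame here is rebuilt from the first seven rows of the question's data so the snippet runs on its own, and `result` is just an illustrative name:

```r
# First seven rows of the question's data, with "na" stored as a string
data <- data.frame(
  x1 = c("1270", "570", "14130", "620", "29910", "310", "na"),
  x2 = c("9", "11", "na", "9", "4", "12", "4"),
  x3 = c("2008", "2007", "2006", "2009", "na", "2013", "2013"),
  Date = "26.11.2014",
  Sales = c(5577L, 5919L, 6911L, 13307L, 5640L, 6555L, 11430L),
  id = 1:7,
  stringsAsFactors = TRUE
)

# Turn the "na" strings into real missing values
data[data == "na"] <- NA

# Keep incomplete rows, but only the two requested columns
result <- data[!complete.cases(data), c("Date", "id")]
result  # the rows with id 3, 5 and 7
```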

To leave your "na" values unchanged, do this instead:

data[rowSums(data == "na") > 0, ]
#      x1 x2   x3       Date Sales id
# 3 14130 na 2006 26.11.2014  6911  3
# 5 29910  4   na 26.11.2014  5640  5
# 7    na  4 2013 26.11.2014 11430  7
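
One caveat with the `rowSums` approach: if the data also contains real NA values, `data == "na"` yields NA for those cells, the row sum becomes NA, and the subset returns an all-NA row. A hedged sketch of a more defensive variant, which also returns just Date and id; the real NA in `Sales` is added here purely for illustration and is not part of the question's data, and `hits` and `result` are illustrative names:

```r
# First seven rows of the question's data, plus one genuine NA in Sales
# (the genuine NA is an assumption added to show the edge case)
data <- data.frame(
  x1 = c("1270", "570", "14130", "620", "29910", "310", "na"),
  x2 = c("9", "11", "na", "9", "4", "12", "4"),
  x3 = c("2008", "2007", "2006", "2009", "na", "2013", "2013"),
  Date = "26.11.2014",
  Sales = c(5577L, NA, 6911L, 13307L, 5640L, 6555L, 11430L),
  id = 1:7,
  stringsAsFactors = TRUE
)

# na.rm = TRUE ignores cells where the comparison itself is NA,
# so only rows containing the "na" string are counted
hits <- rowSums(data == "na", na.rm = TRUE) > 0

# Rows with at least one "na" string, restricted to Date and id
result <- data[hits, c("Date", "id")]
result  # the rows with id 3, 5 and 7; row 2 (real NA) is not selected
```

If rows with real NA values should be selected as well, combining the condition with `!complete.cases(data)` would be one way to do it.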