Check which rows of a list-type data.frame column are NULL

Time: 2016-05-12 07:13:18

Tags: r dataframe data.table

How can I check which rows of my "Subdocuments" column are NULL, and then return only the rows where it is not NULL? The "Subdocuments" column is of type "list".

I tried the following lines of code, but none of them work. I expect the solution to be something similar.

  newdata<- data[!is.null(data$Subdocuments),]
  newdata<- data[!is.null(unlist(data$Subdocuments)),]
  newdata<- data[!is.na(data$Subdocuments),]
  newdata<- data[!is.na(unlist(data$Subdocuments)),]

Here is the data.

 data <- structure(list(ID = c("1", "2", "3"), Country = c("Netherlands", 
        "Germany", "Belgium"), Subdocuments = list(structure(list(Value = c("5", 
        "5", "1", "3", "2", "1", "1", "1", "2", "5", "3", "2", "4", "5", 
        "5", "2"), Label = c("Test1", "Test2", "Test3", "Test4", "Test5", 
        "Test6", "Test7", "Test8", "Test9", "Test10", "Test11", "Test12", 
        "Test13", "Test14", "Test15", "Test16"), Year = c(2001, 2002, 
        2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 
        2014, 2015, 2016)), .Names = c("Value", "Label", "Year"), class = "data.frame", row.names = c(NA, 
        16L)), structure(list(Value = c("5", "4", "3", "2", "2", "2", 
        "1", "1", "5", "4", "4", "4", "5", "1", "1", "3"), Label = c("Test1", 
        "Test2", "Test3", "Test4", "Test5", "Test6", "Test7", "Test8", 
        "Test9", "Test10", "Test11", "Test12", "Test13", "Test14", "Test15", 
        "Test16"), Year = c(2001, 2002, 2003, 2004, 2005, 2006, 2007, 
        2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016)), .Names = c("Value", 
        "Label", "Year"), class = "data.frame", row.names = c(NA, 16L
        )), NULL)), .Names = c("ID", "Country", "Subdocuments"), row.names = c(NA, 
        -3L), class = "data.frame")

  class(data$Subdocuments) # "list"

  is.null(data$Subdocuments[[1]]) # FALSE
  is.null(data$Subdocuments[[2]]) # FALSE
  is.null(data$Subdocuments[[3]]) # TRUE

The expected output is the first two rows of the data.frame.

A solution using the data.table package would be ideal, because I have another, very large dataset to which I want to apply the same approach.

1 answer:

Answer 0: (score: 3)

data.table solution:

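library(data.table)
dt <- as.data.table(data)
# is.null() is applied to each element of the list column, giving one logical per row
dt[!sapply(Subdocuments, is.null)]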
##    ID     Country Subdocuments
## 1:  1 Netherlands <data.frame>
## 2:  2     Germany <data.frame>
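
Since the question mentions a very large dataset, a slightly stricter variant may be worth considering. The following is a minimal sketch, not part of the original answer: it swaps `sapply` for `vapply`, which always returns a logical vector of the declared type, even for an empty input. The toy table `toy_dt` and its contents are hypothetical and only stand in for the real data.

```r
library(data.table)

# Hypothetical toy data with a list column; the third entry is NULL.
toy_df <- data.frame(ID = c("1", "2", "3"), stringsAsFactors = FALSE)
toy_df$Subdocuments <- list(data.frame(Value = "5"),
                            data.frame(Value = "4"),
                            NULL)
toy_dt <- as.data.table(toy_df)

# vapply() checks each list element and is guaranteed to return logical(n).
kept <- toy_dt[!vapply(Subdocuments, is.null, logical(1))]

# Alternative (assumption: empty, zero-length elements should also be dropped):
# lengths() is 0 for NULL, so this keeps the same two rows here.
kept_by_length <- toy_dt[lengths(Subdocuments) > 0]
```

Note that the `lengths()` form is not strictly equivalent to the `is.null` test: it would also drop elements such as a zero-column data.frame or `list()`, so it only fits if that behaviour is wanted.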

Base R solution:

data[!sapply(data$Subdocuments, is.null), ]
##   ID     Country                                                                                                                                                                                                                                                          Subdocuments
## 1  1 Netherlands 5, 5, 1, 3, 2, 1, 1, 1, 2, 5, 3, 2, 4, 5, 5, 2, Test1, Test2, Test3, Test4, Test5, Test6, Test7, Test8, Test9, Test10, Test11, Test12, Test13, Test14, Test15, Test16, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
## 2  2     Germany 5, 4, 3, 2, 2, 2, 1, 1, 5, 4, 4, 4, 5, 1, 1, 3, Test1, Test2, Test3, Test4, Test5, Test6, Test7, Test8, Test9, Test10, Test11, Test12, Test13, Test14, Test15, Test16, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
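
For context on why the attempts in the question do not filter anything, here is a short sketch on a hypothetical three-row data.frame (`toy`, not the asker's data). The point it illustrates is that `is.null` and `is.na` applied to the whole column do not produce a per-row NULL test, whereas applying `is.null` to each element does.

```r
# Hypothetical toy data with a list column; the third entry is NULL.
toy <- data.frame(ID = c("1", "2", "3"), stringsAsFactors = FALSE)
toy$Subdocuments <- list(data.frame(Value = "5"),
                         data.frame(Value = "4"),
                         NULL)

# is.null() on the whole column asks "is the list itself NULL?".
# It returns a single FALSE, so !FALSE is recycled and every row is kept.
whole_column_check <- is.null(toy$Subdocuments)
rows_after_attempt <- nrow(toy[!is.null(toy$Subdocuments), ])

# is.na() on a list is elementwise, but a NULL element is not NA,
# so nothing is flagged either.
any_na <- any(is.na(toy$Subdocuments))

# Applying is.null() to each element gives one logical per row.
per_row_null <- sapply(toy$Subdocuments, is.null)
filtered <- toy[!per_row_null, ]
```

The `unlist()` variants from the question fail for a related reason: `unlist()` flattens the nested data.frames into one long vector, so the result no longer has one entry per row.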