Question

I have a data frame with this structure:

Note.Reco  Reason.Reco  Suggestion.Reco  Contact
9          absent       tomorrow          yes
8                       tomorrow          yes
8          present      today             no
5                       yesterday         no

I want to remove every row that contains an empty value from this data frame.

Expected result:

 Note.Reco  Reason.Reco  Suggestion.Reco  Contact
  9          absent       tomorrow          yes
  8          present      today             no

I tried this R command:

IRC_DF[!(is.na(IRC_DF$Reason.Reco) | IRC_DF$Reason.Reco==" "), ]

But I get the same input data frame back.

Any ideas, please?

Thanks

Answer 1

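The attempt compares against " " (a single space), which never matches an empty string "". We need to change the syntax to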

# Drop rows where Reason.Reco is NA or an empty string
IRC_DF[!(is.na(IRC_DF$Reason.Reco) | IRC_DF$Reason.Reco == ""), ]
#   Note.Reco Reason.Reco Suggestion.Reco Contact
#1         9      absent        tomorrow     yes
#3         8     present           today      no
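
To see why the attempt in the question returned every row, here is a minimal sketch. It rebuilds the example data with data.frame() so it runs on its own. The helper name is_blank is hypothetical, and the trimws() call, which also treats whitespace-only cells as empty, is an assumption that goes beyond the data shown in the question.

```r
# Rebuild the example data so this block is self-contained
IRC_DF <- data.frame(
  Note.Reco       = c(9L, 8L, 8L, 5L),
  Reason.Reco     = c("absent", "", "present", ""),
  Suggestion.Reco = c("tomorrow", "tomorrow", "today", "yesterday"),
  Contact         = c("yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

# The attempt from the question: " " (one space) matches none of the cells,
# so no row is excluded
attempt <- IRC_DF[!(is.na(IRC_DF$Reason.Reco) | IRC_DF$Reason.Reco == " "), ]
nrow(attempt)

# Hypothetical helper: treat NA, "" and whitespace-only strings as empty
is_blank <- function(x) is.na(x) | trimws(x) == ""
cleaned <- IRC_DF[!is_blank(IRC_DF$Reason.Reco), ]
cleaned
```

If the real data may contain cells with stray spaces, a trimmed comparison like this is probably the safer test.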

If several columns contain NA or blanks (""), then

IRC_DF[Reduce(`&`, lapply(IRC_DF, function(x) !(is.na(x)|x==""))),]
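
One way to read the multi-column filter is as a per-column test combined with `&` into a single row mask. The sketch below shows that mask next to an alternative that first turns empty strings into NA and then uses base R's complete.cases(). The names df, df_na, keep and keep2 are illustrative, and the extra NA in Suggestion.Reco is not in the original data; it is added only to exercise the multi-column case.

```r
# Illustrative data: same columns as IRC_DF, with an added NA in another column
df <- data.frame(
  Note.Reco       = c(9L, 8L, 8L, 5L),
  Reason.Reco     = c("absent", "", "present", "late"),
  Suggestion.Reco = c("tomorrow", "tomorrow", "today", NA),
  Contact         = c("yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

# lapply() builds one logical vector per column (TRUE = cell is filled);
# Reduce() combines them with & so a row is kept only if every cell is filled
keep <- Reduce(`&`, lapply(df, function(x) !(is.na(x) | x == "")))

# Alternative: convert "" to NA column by column, then keep complete rows
df_na <- df
df_na[] <- lapply(df, function(x) replace(x, !is.na(x) & x == "", NA))
keep2 <- complete.cases(df_na)

df[keep, ]
```

Both masks should select the same rows; the NA-conversion route can be handy when later steps expect NA rather than "".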

Data

IRC_DF <- structure(list(Note.Reco = c(9L, 8L, 8L, 5L), Reason.Reco = c("absent", 
 "", "present", ""), Suggestion.Reco = c("tomorrow", "tomorrow", 
 "today", "yesterday"), Contact = c("yes", "yes", "no", "no")), .Names = c("Note.Reco", 
 "Reason.Reco", "Suggestion.Reco", "Contact"), class = "data.frame", row.names = c(NA, 
 -4L))

Answer 2

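Or use the filter function from dplyr.

library(dplyr)
# Both conditions must hold, so combine them with & (using | would keep every row)
filter(IRC_DF, !is.na(Reason.Reco) & Reason.Reco != "")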

Answer 3

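I ran into the same error when fitting training data to a single decision tree. It went away once I removed the NA values from the original data before splitting it into training and test sets. My guess is that the data ended up mismatched between the split and the model fit. Some steps:
1: Remove the NAs from the other predictor columns.
2: Now split into training and test sets.
3: Now train the model; hopefully the error is resolved.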
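
The order of the steps above can be sketched in base R as follows. The data frame dat, its columns, the seed and the 70/30 split ratio are all hypothetical choices made for illustration; the answer does not say which data or tree package was used, so the model fit itself is left as a comment.

```r
set.seed(42)  # assumed seed, only to make the split reproducible

# Hypothetical data with missing values in the predictors
dat <- data.frame(
  x1 = c(1.2, NA, 3.1, 4.8, 5.0, NA, 7.3, 8.9, 9.4, 10.1),
  x2 = c(5, 3, NA, 8, 2, 7, 1, 9, 4, 6),
  y  = c("a", "b", "a", "b", "a", "b", "a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Step 1: drop rows with NA in any column before splitting
dat_clean <- na.omit(dat)

# Step 2: split the cleaned rows into training (about 70%) and test sets
n <- nrow(dat_clean)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- dat_clean[train_idx, ]
test  <- dat_clean[-train_idx, ]

# Step 3: fit the model on `train` and evaluate on `test`
# (the fitting call depends on the tree package in use)
```

Cleaning before the split means the training and test sets are drawn from the same set of complete rows.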

Filter empty rows in a data frame using R
