R: Cannot drop NAs

Time: 2018-04-16 04:04:58

Tags: r datatable dplyr tidyr

I'm confused. I have tried several ways to remove NAs from my data.frame / data.table: na.omit, dropNA() (a function I found on StackOverflow), and complete.cases.

dropNA()

dropNA <- function(dat) {
  dat %>% filter(rowSums(is.na(.)) != ncol(.))
}
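
Note that, as written, this dropNA() removes only rows in which every column is NA; a row with just some NAs is kept. The following is a minimal sketch of that difference, using base R only and a small made-up data frame (`toy` and the other names are hypothetical, not the asker's data):

```r
# Hypothetical toy data: row 2 is partly NA, row 3 is entirely NA.
toy <- data.frame(a = c(1, NA, NA), b = c("x", "y", NA))

# Same test as dropNA(): drop a row only when all of its columns are NA.
all_na <- rowSums(is.na(toy)) == ncol(toy)
kept_by_dropNA <- toy[!all_na, ]

# complete.cases() (like na.omit()) drops a row when any column is NA.
kept_by_complete <- toy[complete.cases(toy), ]

nrow(kept_by_dropNA)    # 2
nrow(kept_by_complete)  # 1
```

So a row that is only partly NA survives dropNA() but not na.omit() or complete.cases().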

I tried the methods above to remove the NAs, but as you can see in the tibble below, NAs are still included in the result.

> # drop NAs:
> design_mat4 <- na.omit(design_mat4)
> design_mat4 <- dropNA(design_mat4)
> design_mat4 <- design_mat4[complete.cases(design_mat4), ]
> target_n <- sum(design_mat4$label == 0)
> a <- design_mat4[which(design_mat4$label == 1), ]
> positive_samp = a[sample(x       = nrow(design_mat4),
+                          size    = target_n, 
+                          replace = TRUE), ]
> positive_samp
# A tibble: 50,447 x 14
   email_status score email_is_blacklis~ email_domain_is_bla~ email_domain_blackl~ email_domain_pa~
   <fct>        <int> <fct>              <fct>                <fct>                <fct>           
 1 verified        85 0                  0                    ""                   not_parked      
 2 verified        85 1                  0                    ""                   not_parked      
 3 verified        85 0                  0                    ""                   not_parked      
 4 NA              NA NA                 NA                   NA                   NA              
 5 verified        57 1                  0                    ""                   not_parked      
 6 verified        85 0                  0                    ""                   no_website_cont~
 7 verified        57 1                  0                    ""                   not_parked      
 8 verified        85 0                  0                    ""                   not_parked      
 9 NA              NA NA                 NA                   NA                   NA              
10 verified        85 0                  0                    ""                   not_parked      
# ... with 50,437 more rows, and 8 more variables: email_domain_lawsite <fct>, . . ., label <fct>

Is this because the tibble prints summary information about the original state of the data?

In the end, I want the NAs removed. Please help!

1 Answer:

Answer 0: (score: 0)

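There may be a conflict with other packages already loaded in your R session. Try prefixing the function you use with the name of its package, as follows: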

library(dplyr)

# tibble() replaces the deprecated data_frame()
df <- tibble(a = c(1, NA, 5, 99), b = c(20, -1, NA, NA))
df %>%
  stats::na.omit()
# A tibble: 1 x 2
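
Separately, it may be worth checking the sampling step in the question: `sample(x = nrow(design_mat4), ...)` draws row numbers from the full table, but those numbers are then used to index `a`, which holds only the rows with `label == 1`. If that is what is happening, any index larger than `nrow(a)` would return a row of NAs even after the NAs have been dropped. A minimal sketch with made-up data (`full`, `pos`, and `idx` are hypothetical stand-ins, not the asker's objects):

```r
# Hypothetical stand-ins for design_mat4 and its label == 1 subset.
full <- data.frame(score = c(85, 57, 85, 57), label = c(0, 1, 0, 1))
pos  <- full[full$label == 1, ]   # 2 rows, no NAs

# Indexing past the last row of a data frame yields a row of NAs.
out_of_range <- pos[c(1, 4), ]
any(is.na(out_of_range))   # TRUE

# Drawing indices from nrow(pos) keeps every index in range.
set.seed(1)
idx <- sample(x = nrow(pos), size = 5, replace = TRUE)
in_range <- pos[idx, ]
any(is.na(in_range))       # FALSE
```

Under that assumption, replacing `nrow(design_mat4)` with `nrow(a)` in the call to `sample()` would keep every sampled index within `a`.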