I'm confused. I have tried several ways to remove NAs from my data.frame / data.table: na.omit, dropNA() (a function I found on StackOverflow), and complete.cases.
dropNA():
dropNA <- function(dat) {
  dat %>% filter(rowSums(is.na(.)) != ncol(.))
}
I tried the methods above to remove the NAs, but as you can see in the tibble below, NAs are still present in the result.
> # drop NAs:
> design_mat4 <- na.omit(design_mat4)
> design_mat4 <- dropNA(design_mat4)
> design_mat4 <- design_mat4[complete.cases(design_mat4), ]
> target_n <- sum(design_mat4$label == 0)
> a <- design_mat4[which(design_mat4$label == 1), ]
> positive_samp = a[sample(x = nrow(design_mat4),
+ size = target_n,
+ replace = TRUE), ]
> positive_samp
# A tibble: 50,447 x 14
email_status score email_is_blacklis~ email_domain_is_bla~ email_domain_blackl~ email_domain_pa~
<fct> <int> <fct> <fct> <fct> <fct>
1 verified 85 0 0 "" not_parked
2 verified 85 1 0 "" not_parked
3 verified 85 0 0 "" not_parked
4 NA NA NA NA NA NA
5 verified 57 1 0 "" not_parked
6 verified 85 0 0 "" no_website_cont~
7 verified 57 1 0 "" not_parked
8 verified 85 0 0 "" not_parked
9 NA NA NA NA NA NA
10 verified 85 0 0 "" not_parked
# ... with 50,437 more rows, and 8 more variables: email_domain_lawsite <fct>, . . ., label <fct>
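Is this because the tibble generates summary statistics about the original state of the data?
In the end, I just want the NAs removed. Please help!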
Answer 0: (score: 0)
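There may be a conflict with other packages loaded in your R session. Try prefixing each function you call with the name of its package, like this: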
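library(dplyr)
# tibble() replaces the deprecated data_frame()
df <- tibble(a = c(1, NA, 5, 99), b = c(20, -1, NA, NA))
# Call na.omit through its namespace so no other package can mask it
df %>%
  stats::na.omit()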
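Separately, the sampling step in the question is worth checking. The row indices are drawn from nrow(design_mat4), but they are used to index a, which has fewer rows. In a base data.frame, a row index beyond the last row returns a row of NAs, so NA rows can appear even when the cleaned data contains none. Below is a minimal sketch of that effect; toy, bad_idx, good_idx and the values in it are hypothetical stand-ins, not the asker's data.

```r
set.seed(1)  # only to make the sketch repeatable

# Hypothetical toy data standing in for design_mat4: no NA values anywhere.
toy <- data.frame(score = c(85, 57, 85, 57, 85, 85),
                  label = factor(c(0, 0, 0, 0, 1, 1)))

target_n <- sum(toy$label == 0)
a <- toy[which(toy$label == 1), ]   # subset with fewer rows than toy

# As in the question: indices come from 1..nrow(toy), so some can exceed nrow(a).
# Any out-of-range index produces an all-NA row in the result.
bad_idx <- sample(x = nrow(toy), size = target_n, replace = TRUE)
bad_samp <- a[bad_idx, ]

# Drawing indices from 1..nrow(a) keeps every index in range.
good_idx <- sample(seq_len(nrow(a)), size = target_n, replace = TRUE)
good_samp <- a[good_idx, ]

anyNA(good_samp)  # no NA rows when indices stay within nrow(a)
```

If this is what is happening, changing the x argument of sample() to refer to a rather than design_mat4 should be enough; the NA-removal calls themselves would not need to change.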