Removing NAs in a dplyr pipe

Asked: 2014-10-30 23:43:07

Tags: r dplyr na

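I tried to remove NAs from a subset using a dplyr pipe, but they are still in the result. Does the output below mean I missed a step? I'm trying to learn how to write functions using dplyr: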

> outcome.df%>%
+ group_by(Hospital,State)%>%
+ arrange(desc(HeartAttackDeath,na.rm=TRUE))%>%
+ head()
Source: local data frame [6 x 5]
Groups: Hospital, State
                           Hospital State HeartAttackDeath
1     ABBEVILLE AREA MEDICAL CENTER    SC               NA
2        ABBEVILLE GENERAL HOSPITAL    LA               NA
3      ABBOTT NORTHWESTERN HOSPITAL    MN             12.3
4   ABILENE REGIONAL MEDICAL CENTER    TX             17.2
5        ABINGTON MEMORIAL HOSPITAL    PA             14.3
6 ABRAHAM LINCOLN MEMORIAL HOSPITAL    IL               NA
Variables not shown: HeartFailureDeath (dbl), PneumoniaDeath
  (dbl)

1 answer:

Answer 0 (score: 119)

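I don't think desc takes an na.rm argument... I'm actually surprised it doesn't throw an error when you give it one. If you just want to remove every row that contains an NA in any column, use na.omit (base R) or tidyr::drop_na: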

outcome.df %>%
  na.omit() %>%
  group_by(Hospital, State) %>%
  arrange(desc(HeartAttackDeath)) %>%
  head()

library(tidyr)
outcome.df %>%
  drop_na() %>%
  group_by(Hospital, State) %>%
  arrange(desc(HeartAttackDeath)) %>%
  head()
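
To make the effect of the two calls above concrete, here is a minimal sketch on a made-up data frame. The name toy and its values are hypothetical stand-ins for outcome.df, which is not shown in full in the question. With no column arguments, both functions drop any row that has an NA in any column:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for outcome.df (values are invented)
toy <- data.frame(
  Hospital         = c("A", "B", "C", "D"),
  State            = c("SC", "LA", "MN", "TX"),
  HeartAttackDeath = c(NA, 14.3, 12.3, 17.2),
  PneumoniaDeath   = c(10.1, NA, 11.5, 9.8),
  stringsAsFactors = FALSE
)

# Row A has an NA in HeartAttackDeath, row B has an NA in PneumoniaDeath,
# so both approaches keep only rows C and D.
via_na_omit <- toy %>% na.omit()
via_drop_na <- toy %>% drop_na()

via_na_omit$Hospital
via_drop_na$Hospital
```

Note that row B is dropped even though its HeartAttackDeath value is present, which may or may not be what you want.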

If you only want to remove rows where the HeartAttackDeath column is NA, filter with is.na, or pass the column name to tidyr::drop_na:

outcome.df %>%
  filter(!is.na(HeartAttackDeath)) %>%
  group_by(Hospital, State) %>%
  arrange(desc(HeartAttackDeath)) %>%
  head()

outcome.df %>%
  drop_na(HeartAttackDeath) %>%
  group_by(Hospital, State) %>%
  arrange(desc(HeartAttackDeath)) %>%
  head()
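
As a rough illustration of the column-specific versions, the sketch below uses an invented data frame (toy is a hypothetical stand-in for outcome.df). Rows with an NA only in some other column are kept, and the grouping step is left out to keep the example short:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for outcome.df (values are invented)
toy <- data.frame(
  Hospital         = c("A", "B", "C", "D"),
  State            = c("SC", "LA", "MN", "TX"),
  HeartAttackDeath = c(NA, 14.3, 12.3, 17.2),
  PneumoniaDeath   = c(10.1, NA, 11.5, 9.8),
  stringsAsFactors = FALSE
)

# Only row A is dropped: row B survives because its NA is in PneumoniaDeath.
via_filter <- toy %>%
  filter(!is.na(HeartAttackDeath)) %>%
  arrange(desc(HeartAttackDeath))

via_drop_na <- toy %>%
  drop_na(HeartAttackDeath) %>%
  arrange(desc(HeartAttackDeath))

via_filter$Hospital
via_drop_na$Hospital
```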

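As pointed out in the dupe, complete.cases can also be used, but it is a bit trickier to put in the chain because it takes a data frame as an argument and returns a logical vector (one element per row) rather than a data frame. So you can use it like this: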

outcome.df %>%
  filter(complete.cases(.)) %>%
  group_by(Hospital, State) %>%
  arrange(desc(HeartAttackDeath)) %>%
  head()
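
To show what complete.cases hands back to filter, here is a small sketch on an invented data frame (toy is a hypothetical stand-in for outcome.df). It assumes the magrittr pipe re-exported by dplyr, where the dot refers to the data frame on the left-hand side:

```r
library(dplyr)

# Hypothetical toy data standing in for outcome.df (values are invented)
toy <- data.frame(
  Hospital         = c("A", "B", "C", "D"),
  State            = c("SC", "LA", "MN", "TX"),
  HeartAttackDeath = c(NA, 14.3, 12.3, 17.2),
  PneumoniaDeath   = c(10.1, NA, 11.5, 9.8),
  stringsAsFactors = FALSE
)

# complete.cases() returns one logical per row: TRUE when the row has no NA.
keep <- complete.cases(toy)
keep

# Inside the pipe, `.` is the incoming data frame, so filter() receives
# that same logical vector and keeps only the complete rows.
complete_rows <- toy %>%
  filter(complete.cases(.))

complete_rows$Hospital
```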