Remove rows with NA values and also remove those observations in the other year

Time: 2017-12-14 12:27:34

Tags: r dplyr

I'm finding it a bit hard to put what I want to do into words.

Say I have this data frame:

library(dplyr)

# A tibble: 74 x 3
       country  year conf_perc
         <chr> <dbl>     <dbl>
 1      Canada  2017        77
 2      France  2017        45
 3     Germany  2017        60
 4      Greece  2017        33
 5     Hungary  2017        67
 6       Italy  2017        38
 7      Canada  2009        88
 8      France  2009        91
 9     Germany  2009        93
10      Greece  2009        NA
11     Hungary  2009        NA
12       Italy  2009        NA
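
For anyone who wants to reproduce this, here is a minimal sketch that builds only the twelve rows printed above (the full 74-row tibble is not shown, so the rest is left out). The object name df1 is an assumption, chosen to match the name used in the first answer.

```r
library(dplyr)

# Only the 12 rows printed in the question; the remaining rows are not shown.
# The name df1 is an assumption for illustration.
df1 <- tibble(
  country   = rep(c("Canada", "France", "Germany",
                    "Greece", "Hungary", "Italy"), times = 2),
  year      = rep(c(2017, 2009), each = 6),
  conf_perc = c(77, 45, 60, 33, 67, 38, 88, 91, 93, NA, NA, NA)
)
```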

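Now I want to remove the rows that have NA values in 2009, and I also want to remove the 2017 rows for those same countries. I would like to get the following result: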

# A tibble: 74 x 3
       country  year conf_perc
         <chr> <dbl>     <dbl>
 1      Canada  2017        77
 2      France  2017        45
 3     Germany  2017        60
 4      Canada  2009        88
 5      France  2009        91
 6     Germany  2009        93

2 answers:

Answer 0: (score: 5)

We can filter after grouping by 'country', keeping only the groups that have no NA in conf_perc:

library(dplyr)
df1 %>%
  group_by(country) %>%
  filter(!any(is.na(conf_perc)))
# A tibble: 6 x 3
# Groups: country [3]
#   country  year conf_perc
#     <chr> <int>     <int>
#1  Canada  2017        77
#2  France  2017        45
#3 Germany  2017        60
#4  Canada  2009        88
#5  France  2009        91
#6 Germany  2009        93
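
Note that grouping by country and testing any(is.na(conf_perc)) drops a country when conf_perc is missing in any year, not only in 2009. If only a missing 2009 value should trigger the removal, the condition can be narrowed. A minimal sketch, assuming the data is in df1 with the columns shown in the question (the twelve sample rows are rebuilt here so the block runs on its own; res is a hypothetical name):

```r
library(dplyr)

# Sample rows from the question, rebuilt for a self-contained example.
df1 <- tibble(
  country   = rep(c("Canada", "France", "Germany",
                    "Greece", "Hungary", "Italy"), times = 2),
  year      = rep(c(2017, 2009), each = 6),
  conf_perc = c(77, 45, 60, 33, 67, 38, 88, 91, 93, NA, NA, NA)
)

res <- df1 %>%
  group_by(country) %>%
  # keep a country only if none of its 2009 rows has a missing conf_perc
  filter(!any(year == 2009 & is.na(conf_perc))) %>%
  ungroup()
```

On the sample data both versions keep the same rows, because the only missing values are in 2009.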

Answer 1: (score: 2)

A base R solution:

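# rows from 2009 with a missing conf_perc
foo <- df$year == 2009 & is.na(df$conf_perc)
# 2017 rows for the countries flagged above
bar <- df$year == 2017 & df$country %in% unique(df$country[foo])
# drop both sets of rows by position
df[-c(which(foo), which(bar)), ]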

#   country year conf_perc
# 1  Canada 2017        77
# 2  France 2017        45
# 3 Germany 2017        60
# 7  Canada 2009        88
# 8  France 2009        91
# 9 Germany 2009        93
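
One caveat with negative integer indexing: if no row matches, c(which(foo), which(bar)) is integer(0), and df[-integer(0), ] returns zero rows rather than all of them. A sketch of the same idea with a logical mask, which avoids that case. Here df is rebuilt from the sample rows in the question so the block is self-contained, and res is a hypothetical name:

```r
# Sample rows from the question, rebuilt for a self-contained example.
df <- data.frame(
  country   = rep(c("Canada", "France", "Germany",
                    "Greece", "Hungary", "Italy"), times = 2),
  year      = rep(c(2017, 2009), each = 6),
  conf_perc = c(77, 45, 60, 33, 67, 38, 88, 91, 93, NA, NA, NA),
  stringsAsFactors = FALSE
)

# rows from 2009 with a missing conf_perc
foo <- df$year == 2009 & is.na(df$conf_perc)
# 2017 rows for the countries flagged above
bar <- df$year == 2017 & df$country %in% df$country[foo]

# keep every row that is in neither set; safe even when nothing matches
res <- df[!(foo | bar), ]
```

The logical mask has one entry per row, so an all-FALSE mask simply keeps the whole data frame.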