I have the following data frame:
   id participate grade year
1   1          NA     4 1982
2   1           1     4 1982
3   1           4     4 1982
4   4          NA    NA 1987
5   5          NA    NA 1986
6   5          NA     1 1986
7   5          NA     1 1986
8   7          NA     2 1984
9   7           4     2 1984
10  7           1     2 1984
11  9          NA     1 1987
12  9           1     1 1987
13 10          NA    NA 1984
14 10          NA     2 1984
15 10           4     2 1984
16 11          NA     4 1985
17 11           1     4 1985
18 13          NA     3 1985
19 13           1     3 1985
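My goal is to identify and remove, within each group (id), the rows where participate is NA, but only when participate is filled in for at least one other row of that group.
In this example that means: remove row 1 (id = 1). For id = 4 I would not remove anything, because the group holds no further information. The same goes for id = 5. Rows 8, 11, 13, 14 and so on should be removed.
This is the desired output: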
   id participate grade year
1   1           1     4 1982
2   1           4     4 1982
3   4          NA    NA 1987
4   5          NA    NA 1986
5   5          NA     1 1986
6   5          NA     1 1986
7   7           4     2 1984
8   7           1     2 1984
9   9           1     1 1987
10 10           4     2 1984
11 11           1     4 1985
12 13           1     3 1985
Answer 0 (score: 1)
# Load package
library(tidyverse)
# Create example dataset
dat <- tibble(id = c(1L, 1L, 1L, 4L, 5L,
                     5L, 5L, 7L, 7L, 7L,
                     9L, 9L, 10L, 10L, 10L,
                     11L, 11L, 13L, 13L),
              participate = c(NA, 1L, 4L, NA, NA,
                              NA, NA, NA, 4L, 1L,
                              NA, 1L, NA, NA, 4L,
                              NA, 1L, NA, 1L),
              grade = c(4L, 4L, 4L, NA, NA,
                        1L, 1L, 2L, 2L, 2L,
                        1L, 1L, NA, 2L, 2L,
                        4L, 4L, 3L, 3L),
              year = c(1982, 1982, 1982, 1987, 1986,
                       1986, 1986, 1984, 1984, 1984,
                       1987, 1987, 1984, 1984, 1984,
                       1985, 1985, 1985, 1985))
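# Filter the data: within each id, keep the rows where participate is filled,
# or keep every row if participate is NA for the whole group
dat2 <- dat %>%
  group_by(id) %>%
  filter(!is.na(participate) | all(is.na(participate)))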
# See the result
dat2
Source: local data frame [12 x 4]
Groups: id [8]
id participate grade year
<int> <int> <int> <dbl>
1 1 1 4 1982
2 1 4 4 1982
3 4 NA NA 1987
4 5 NA NA 1986
5 5 NA 1 1986
6 5 NA 1 1986
7 7 4 2 1984
8 7 1 2 1984
9 9 1 1 1987
10 10 4 2 1984
11 11 1 4 1985
12 13 1 3 1985
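
For anyone who prefers to avoid the tidyverse dependency, the same rule can be sketched in base R with ave(). This is not part of the answer above, only an equivalent under the same assumptions; the names all_na_in_group, keep and dat2_base are made up for illustration.

```r
# Same example data as in the question, built with base R only
dat <- data.frame(
  id = c(1L, 1L, 1L, 4L, 5L, 5L, 5L, 7L, 7L, 7L,
         9L, 9L, 10L, 10L, 10L, 11L, 11L, 13L, 13L),
  participate = c(NA, 1L, 4L, NA, NA, NA, NA, NA, 4L, 1L,
                  NA, 1L, NA, NA, 4L, NA, 1L, NA, 1L),
  grade = c(4L, 4L, 4L, NA, NA, 1L, 1L, 2L, 2L, 2L,
            1L, 1L, NA, 2L, 2L, 4L, 4L, 3L, 3L),
  year = c(1982, 1982, 1982, 1987, 1986, 1986, 1986, 1984, 1984, 1984,
           1987, 1987, 1984, 1984, 1984, 1985, 1985, 1985, 1985)
)

# For every row: is participate NA in ALL rows of that row's id group?
# ave() applies FUN per group and recycles the result back to row length.
all_na_in_group <- as.logical(
  ave(is.na(dat$participate), dat$id, FUN = all)
)

# Keep a row if participate is filled, or if its whole group is NA
keep <- !is.na(dat$participate) | all_na_in_group

dat2_base <- dat[keep, ]
rownames(dat2_base) <- NULL  # renumber rows 1..n like the desired output

dat2_base
```

The logical condition is the same as in the filter() call; ave() only takes over the role of group_by() by computing the per-group flag for each row.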