R: Remove rows from a data frame based on an external condition

Time: 2017-07-31 19:36:27

Tags: r dataframe

I have two data frames, df.1 and df.2, and I want to remove rows from df.2 depending on whether something is true in df.1. Specifically, I want to remove every row of df.2 whose date and dog correspond to a row of df.1 where feistiness is NA. How can I do this? (I have looked at other questions but still could not work this out.)

Reproducible code for the first data frame:

# create first data frame
dates <- rep(as.Date(5001:5010, origin = "1970-01-01"), times = 4)
dogs <- c(rep("Fido", times = 10), rep("Snoopy", times = 10), rep("Speckles", times = 10), rep("Pit", times = 10))
set.seed(200)
feistiness <- c(round(runif(35, min = 0, max = 100), digits = 0), rep(NA, times = 5))
df.1 <- data.frame(dates, dogs, feistiness)
names(df.1) <- c("date", "dog", "feistiness")

Which yields:

         date     dog feistiness
1  1983-09-11    Fido         56
2  1983-09-12    Fido         18
3  1983-09-13    Fido         97
4  1983-09-14    Fido         49
5  1983-09-15    Fido         49
6  1983-09-16    Fido         59
7  1983-09-17    Fido         72
8  1983-09-18    Fido         69
9  1983-09-19    Fido         18
10 1983-09-20    Fido         95
11 1983-09-11  Snoopy         69
12 1983-09-12  Snoopy         16
13 1983-09-13  Snoopy         58
14 1983-09-14  Snoopy         65
15 1983-09-15  Snoopy         83
16 1983-09-16  Snoopy          7
17 1983-09-17  Snoopy         12
18 1983-09-18  Snoopy         89
19 1983-09-19  Snoopy         56
20 1983-09-20  Snoopy         52
21 1983-09-11 Speckles         13
22 1983-09-12 Speckles         15
23 1983-09-13 Speckles         16
24 1983-09-14 Speckles         56
25 1983-09-15 Speckles         67
26 1983-09-16 Speckles         15
27 1983-09-17 Speckles         57
28 1983-09-18 Speckles         76
29 1983-09-19 Speckles         57
30 1983-09-20 Speckles         78
31 1983-09-11     Pit         68
32 1983-09-12     Pit         22
33 1983-09-13     Pit         28
34 1983-09-14     Pit          9
35 1983-09-15     Pit         59
36 1983-09-16     Pit         NA
37 1983-09-17     Pit         NA
38 1983-09-18     Pit         NA
39 1983-09-19     Pit         NA
40 1983-09-20     Pit         NA

Second data frame:

# create second data frame
dates.2 <- as.Date(c(5002, 5005, 5004, 5009), origin = "1970-01-01")
dogs.2 <- c("Fido", "Snoopy", "Speckles", "Pit")
df.2 <- data.frame(dates.2, dogs.2)
names(df.2) <- c("date", "dog")

Which yields:

        date      dog
1 1983-09-12     Fido
2 1983-09-15   Snoopy
3 1983-09-14 Speckles
4 1983-09-19      Pit

The final output data frame should look like this, with the last row removed because the feistiness value for Pit on 1983-09-19 is NA:

        date      dog
1 1983-09-12     Fido
2 1983-09-15   Snoopy
3 1983-09-14 Speckles

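1 Answer:

Answer 0 (score: 1)

We can use anti_join from dplyr. Filtering df.1 down to the rows where feistiness is NA and anti-joining on date and dog keeps only the rows of df.2 that have no match among them. df_final is the final output.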

library(dplyr)

df_final <- df.2 %>%
  anti_join(df.1 %>% filter(is.na(feistiness)), by = c("date", "dog"))
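
For readers who would rather not load dplyr, the same filtering can be sketched in base R. This is an illustrative alternative, not part of the original answer: the helper make_key and the name df_final_base are made up for this example, and the sketch assumes that pasting date and dog together identifies a row uniquely.

```r
# Rebuild the two data frames from the question so the sketch is self-contained
dates <- rep(as.Date(5001:5010, origin = "1970-01-01"), times = 4)
dogs <- c(rep("Fido", times = 10), rep("Snoopy", times = 10),
          rep("Speckles", times = 10), rep("Pit", times = 10))
set.seed(200)
feistiness <- c(round(runif(35, min = 0, max = 100), digits = 0), rep(NA, times = 5))
df.1 <- data.frame(date = dates, dog = dogs, feistiness = feistiness)

df.2 <- data.frame(
  date = as.Date(c(5002, 5005, 5004, 5009), origin = "1970-01-01"),
  dog = c("Fido", "Snoopy", "Speckles", "Pit")
)

# Hypothetical helper: combine date and dog into a single string key
make_key <- function(d) paste(d$date, d$dog)

# Rows of df.1 whose feistiness is missing
na_rows <- df.1[is.na(df.1$feistiness), ]

# Keep only the rows of df.2 whose key is not among the NA rows
df_final_base <- df.2[!(make_key(df.2) %in% make_key(na_rows)), ]
df_final_base
```

Like anti_join, this keeps a row of df.2 that has no counterpart in df.1 at all. A left join followed by dropping NA values would remove such a row, so the two approaches are not interchangeable.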