How to move data from one data frame to another

Time: 2019-07-16 18:46:50

Tags: r dplyr

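I apologize if this is a duplicate question. I tried searching for it, but I may not have been using the right terminology. If there is a better way to ask this, feel free to change the title of this post.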

I have two data frames:

df <- data.frame("Location" = c("chr1:123", "chr6:2452", "chr8:4352", "chr11:8754", "chr3:76345", "chr7:23454","chr18:23452"),
"Score" = c("tolered(1)", "tolerated(2)", "", "", "deleterious(0.1)", "", "deleterious(0.2)"))

df2 <- data.frame("Location" = c( "chr7:23454", "chr9:243256", "chr8:4352", "chr2:6795452", "chr11:8754","chr18:23452", "chr3:76345"),
                 "Score" = c("", "", "", "", "", "", ""))
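  • df has the locations, and the values in the Score column, that I want to keep.
  • df2 has some of the data from df plus some new data.
  • I want the Score values from df for any Location that appears in df2, collected in a new data frame called df3.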

Desired result:

df3 <- data.frame("Location" = c( "chr7:23454", "chr9:243256", "chr8:4352", "chr2:6795452", "chr11:8754","chr18:23452", "chr3:76345"),
                  "Score" = c("", "", "", "", "", "deleterious(0.2)", "deleterious(0.1)"))

I'm just not sure what the best/fastest way to do this is, or where to start. I have a feeling this can be done with dplyr, but I've never used it for this.

3 Answers:

Answer 0: (score: 1)

Use left_join() from dplyr:

library(dplyr)
df3 <- df2 %>% 
  dplyr::select(-Score) %>% 
  left_join(df, by = "Location") 
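One thing to watch with this join: locations that exist only in df2 (here chr9:243256 and chr2:6795452) come back with Score NA rather than the empty string shown in the desired result. The sketch below is one way to handle that. It assumes Score is a character column (hence stringsAsFactors = FALSE, which is not in the original question) and that an empty string is the wanted placeholder.

```r
library(dplyr)

# Data from the question; stringsAsFactors = FALSE is an added assumption so
# that Score stays character on R versions before 4.0.
df <- data.frame(
  Location = c("chr1:123", "chr6:2452", "chr8:4352", "chr11:8754",
               "chr3:76345", "chr7:23454", "chr18:23452"),
  Score = c("tolered(1)", "tolerated(2)", "", "", "deleterious(0.1)", "",
            "deleterious(0.2)"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Location = c("chr7:23454", "chr9:243256", "chr8:4352", "chr2:6795452",
               "chr11:8754", "chr18:23452", "chr3:76345"),
  Score = "",
  stringsAsFactors = FALSE
)

df3 <- df2 %>%
  select(-Score) %>%                   # drop the empty Score column
  left_join(df, by = "Location") %>%   # keeps every row of df2, in df2's order
  mutate(Score = coalesce(Score, ""))  # unmatched locations: NA -> ""
```

If NA is an acceptable marker for "no score", the final mutate() step can simply be left out.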

Answer 1: (score: 0)

I was able to brute-force this.

I started with:
df3 <- anti_join(df2, df, by = "Location")
df3 <- rbind(df3, df)

But this gave me some extra data that I didn't want or need, so I filtered it using df2:

df3 <- df3 %>%
  filter(Location %in% df2$Location)

This isn't the prettiest way to do it, so if anyone else has a cleaner approach, feel free to answer!
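For comparison, a base R alternative (not the approach used in this answer) looks up each df2 location in df with match() and copies the score across where one is found. This is a minimal sketch. It assumes Location values are unique in df and that Score is character, so stringsAsFactors = FALSE is added to the question's data.

```r
# Data from the question; stringsAsFactors = FALSE is an added assumption.
df <- data.frame(
  Location = c("chr1:123", "chr6:2452", "chr8:4352", "chr11:8754",
               "chr3:76345", "chr7:23454", "chr18:23452"),
  Score = c("tolered(1)", "tolerated(2)", "", "", "deleterious(0.1)", "",
            "deleterious(0.2)"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Location = c("chr7:23454", "chr9:243256", "chr8:4352", "chr2:6795452",
               "chr11:8754", "chr18:23452", "chr3:76345"),
  Score = "",
  stringsAsFactors = FALSE
)

df3 <- df2
idx <- match(df3$Location, df$Location)   # row of df for each df2 location, NA if absent
found <- !is.na(idx)
df3$Score[found] <- df$Score[idx[found]]  # copy scores only where a match exists
```

Because df3 starts as a copy of df2, the rows stay in df2's order and locations missing from df keep their original empty score.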

Answer 2: (score: 0)

df
  Location Score
1        A     1
2        B     2
3        C    NA
4        D    NA
5        E     5
6        F    NA
7        G     7
df2
  Location Score
1        E    NA
2        F    NA
3        G    NA
4        H    NA
5        I    NA
6        J    NA
7        K    11
df3
  Location Score
1        H    NA
2        I    NA
3        J    NA
4        K    11
5        E     5
6        F    NA
7        G     7

Code

library(dplyr)
df3 <- df2 %>%
    anti_join(df, by = "Location") %>%
    bind_rows(inner_join(df, df2 %>% select(1), by = "Location"))

Data

df <- data.frame("Location" = LETTERS[1:7],
                 "Score" = c(1, 2, NA, NA, 5, NA, 7),
                 stringsAsFactors = FALSE)

df2 <- data.frame("Location" = LETTERS[5:11],
                  "Score" = c(rep(NA, 6), 11),
                  stringsAsFactors = FALSE)
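
Note that the df3 shown in this answer lists the df2-only rows first. If df3 should follow the row order of df2 instead, one option is to sort by each location's position in df2 afterwards. The arrange(match(...)) step below is an addition for illustration, not part of the original answer.

```r
library(dplyr)

# Same example data as in this answer.
df <- data.frame(Location = LETTERS[1:7],
                 Score = c(1, 2, NA, NA, 5, NA, 7),
                 stringsAsFactors = FALSE)
df2 <- data.frame(Location = LETTERS[5:11],
                  Score = c(rep(NA, 6), 11),
                  stringsAsFactors = FALSE)

df3 <- df2 %>%
  anti_join(df, by = "Location") %>%         # rows only in df2 keep their own Score
  bind_rows(inner_join(df, df2 %>% select(Location), by = "Location")) %>%
  arrange(match(Location, df2$Location))     # added step: restore df2's row order
```

With this data the rows come out as E through K, with K keeping the score of 11 that it already had in df2.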