R: use a data frame of id pairs to look up values in another data frame and return the paired values

Time: 2017-12-19 23:36:05

Tags: r

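I need a way to read a data frame containing two columns of ids (in effect, a data frame of row-wise pairs), then use those id pairs to search a different data frame and return their corresponding values.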

For example, I have the following data frame of id pairs:

A <- c("a", "b", "a")
B <- c("c", "d", "e")

df_pairs <- data.frame(A, B)

> df_pairs
  A B
1 a c
2 b d
3 a e

And I have a data frame of the corresponding values:

id <- c("a", "b", "c", "d", "e")
val <- c("1", "2", "3", "4", "5")

df_values <- data.frame(id, val)

> df_values
  id val
1  a   1
2  b   2
3  c   3
4  d   4
5  e   5

I would like to return a data frame that looks like this:

  A B A_value B_value
1 a c       1       3
2 b d       2       4
3 a e       1       5

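My aim is to incorporate this into an analysis pipeline. Note that the number of pairs and the number of ids will vary in my actual data, so please take that into account in your solution.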

3 answers:

Answer 0 (score: 2)

Try this:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
A <- c("a", "b", "a")
B <- c("c", "d", "e")

df_pairs <- data.frame(A, B)

id <- c("a", "b", "c", "d", "e")
val <- c("1", "2", "3", "4", "5")

df_values <- data.frame(id, val)

left_join(df_pairs, df_values, by = c("A" = "id")) %>%
  left_join(df_values, by =c("B"= "id")) %>%
  select(A, B, A_value = val.x, B_value = val.y)
## Warning: Column `A`/`id` joining factors with different levels, coercing to
## character vector
## Warning: Column `B`/`id` joining factors with different levels, coercing to
## character vector
##   A B A_value B_value
## 1 a c       1       3
## 2 b d       2       4
## 3 a e       1       5

HTH

Answer 1 (score: 2)

Try:

A <- c("a", "b", "a")
B <- c("c", "d", "e")
df_pairs <- data.frame(A, B, stringsAsFactors = FALSE) 

id <- c("a", "b", "c", "d", "e")
val <- c("1", "2", "3", "4", "5")
names(val) <- id

df_quads <- df_pairs
df_quads$A_value <- val[df_pairs$A]
df_quads$B_value <- val[df_pairs$B]

which gives:

> df_pairs
  A B
1 a c
2 b d
3 a e

> val
  a   b   c   d   e 
"1" "2" "3" "4" "5" 

> df_quads
  A B A_value B_value
1 a c       1       3
2 b d       2       4
3 a e       1       5

Note, though, that your "values" are actually character strings, not numbers.
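
Since the question says the number of pairs and ids will vary, the named-vector lookup can be applied to every id column at once. The following is only a sketch under some assumptions: every column of df_pairs is an id column, df_values has the columns id and val as in the question, and the values should be numeric (hence as.numeric()). The names lookup and value_cols are made up for this illustration. Ids with no match in df_values come back as NA.

```r
# Example data from the question, with strings kept as character (not factor)
df_pairs <- data.frame(A = c("a", "b", "a"),
                       B = c("c", "d", "e"),
                       stringsAsFactors = FALSE)
df_values <- data.frame(id = c("a", "b", "c", "d", "e"),
                        val = c("1", "2", "3", "4", "5"),
                        stringsAsFactors = FALSE)

# Named lookup vector: names are the ids, elements are the values as numbers
lookup <- setNames(as.numeric(df_values$val), df_values$id)

# Look up each id column by name; unname() drops the id labels from the result
value_cols <- lapply(df_pairs, function(ids) unname(lookup[ids]))
names(value_cols) <- paste0(names(df_pairs), "_value")

# Bind the looked-up value columns next to the original id columns
df_quads <- cbind(df_pairs, as.data.frame(value_cols))
```

Because lapply() runs over whatever columns df_pairs has, the same lines should work unchanged if there are more than two id columns or a different number of rows.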

Answer 2 (score: 0)

A double merge also answers this:

merge(
    merge(df_pairs, 
          df_values, 
          by.x=c("A"), 
          by.y=c("id")
          ), 
    df_values, 
    by.x=c("B"), 
    by.y=c("id")
    )
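
One thing to be aware of with the double merge: merge() sorts the result on the key, puts the key column first, and by default drops rows that have no match, so the output comes back as B, A, val.x, val.y rather than in the layout asked for. A possible way to tidy it up is sketched below. The helper column row is an assumption added only for this illustration, and all.x = TRUE is a choice to keep unmatched pairs (with NA values) instead of dropping them.

```r
# Example data from the question, with strings kept as character (not factor)
df_pairs <- data.frame(A = c("a", "b", "a"),
                       B = c("c", "d", "e"),
                       stringsAsFactors = FALSE)
df_values <- data.frame(id = c("a", "b", "c", "d", "e"),
                        val = c("1", "2", "3", "4", "5"),
                        stringsAsFactors = FALSE)

# Remember the original row order, because merge() sorts on the key
df_pairs$row <- seq_len(nrow(df_pairs))

# all.x = TRUE keeps pairs whose id has no match; their value becomes NA
m <- merge(df_pairs, df_values, by.x = "A", by.y = "id", all.x = TRUE)
m <- merge(m, df_values, by.x = "B", by.y = "id", all.x = TRUE)

# Restore the original row order, then select and rename the columns
m <- m[order(m$row), c("A", "B", "val.x", "val.y")]
names(m) <- c("A", "B", "A_value", "B_value")
rownames(m) <- NULL
```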