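Using map from purrr to extract values conditional on another variable

Asked: 2018-10-22 09:05:10

Tags: r purrr

I am trying to map values from several data frames onto one main data frame.

The example below partly works; I am stuck on the last part.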

library(tidyverse)
library(purrr)
library(data.table)

# main data
eid <- c(111,333,555,777,999)
value <-c(121,135,565,400,450)
dat <- as.data.frame(cbind(eid,value),stringsAsFactors=F)

# data from mi to be mapped to main data
eid <- c(111,222,444)
mi.value <- c(134,234,213)   # date value recorded for each eid
mi <- as.data.frame(cbind(eid,mi.value),stringsAsFactors=F)

# data from cva to be mapped to main data
eid <- c(333,444,555,666)
cva.value <- c(124,132,125,457)   # date value recorded for each eid
cva <- as.data.frame(cbind(eid,cva.value),stringsAsFactors=F)

# using map to see if eid in 'mi' and 'cva' appear in main data


each.subsequent <- map(list(mi,cva),~
                     as.integer(dat$eid %in% .x$eid))
names(each.subsequent) <- c("mi","cva")
each.subsequent <- as.data.frame(each.subsequent) 

This next bit does not work:

# maps the numerical value next to the eid (second column of each data frame)
each.subsequent.value <- map(list(mi,cva),~
    ifelse(dat$eid == .x$eid, .x[[2]], NA))

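I found a way to do this with right joins, but it takes a lot of code. So I have two questions:

1) Is there a "map" way to extract the date values from the mi and cva data frames for the rows whose eid matches?

2) What do ~ and .x do in the code above?

The desired output is: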

structure(list(eid = c(111, 333, 555, 777, 999), value = c(121, 
135, 565, 400, 450), mi = c(1L, 0L, 0L, 0L, 0L), cva = c(0L, 
1L, 1L, 0L, 0L), mi.date = c(134, NA, NA, NA, NA), cva.date = c(NA, 
124, 125, NA, NA)), .Names = c("eid", "value", "mi", "cva", "mi.date", 
"cva.date"), row.names = c(NA, -5L), class = "data.frame") 

1 Answer:

Answer 0: (score: 1)

You can do this easily with two left_join calls, unless I am missing something (perhaps there are more data.frames):

dat %>% 
  left_join(mi, by ="eid") %>% 
  left_join(cva, by ="eid")
#   eid value mi.value cva.value
# 1 111   121      134        NA
# 2 333   135       NA       124
# 3 555   565       NA       125
# 4 777   400       NA        NA
# 5 999   450       NA        NA
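
For the "map" part of the question, here is a minimal sketch that is not part of the original answer. The attempt with `dat$eid == .x$eid` compares the two vectors position by position (recycling the shorter one), so it never looks an eid up. `match()` instead returns, for each `dat$eid`, the row position in the other data frame, or `NA` when there is none. The data frames are rebuilt so the block runs on its own, and `.x[[2]]` assumes the value of interest is the second column.

```r
library(purrr)

# Rebuild the example data (second column holds the date value)
dat <- data.frame(eid = c(111, 333, 555, 777, 999),
                  value = c(121, 135, 565, 400, 450))
mi  <- data.frame(eid = c(111, 222, 444),
                  mi.value = c(134, 234, 213))
cva <- data.frame(eid = c(333, 444, 555, 666),
                  cva.value = c(124, 132, 125, 457))

# `~ ...` is purrr's shorthand for an anonymous function and `.x` is its
# argument, so the formula below behaves like
#   function(df) df[[2]][match(dat$eid, df$eid)]
dates <- map(list(mi.date = mi, cva.date = cva),
             ~ .x[[2]][match(dat$eid, .x$eid)])

# Bind the looked-up columns next to the main data
result <- cbind(dat, as.data.frame(dates))
```

Naming the list elements up front (`mi.date`, `cva.date`) lets `map()` carry the names through, so no separate `names<-` step is needed.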

Edit:

If you have more data.frames, use reduce:

list(dat, mi, cva) %>% reduce(left_join, by = "eid")
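
As a rough illustration of what `reduce` does here (a sketch added for clarity, not the answerer's code): it folds the list from the left, so the call is equivalent to `left_join(left_join(dat, mi), cva)`. The same fold can be written with the formula shorthand, where `.x` is the result accumulated so far and `.y` is the next data frame. The 0/1 indicator columns at the end follow the asker's `%in%` approach; the object name `out` is made up for this example.

```r
library(dplyr)
library(purrr)

# Rebuild the example data so the block is self-contained
dat <- data.frame(eid = c(111, 333, 555, 777, 999),
                  value = c(121, 135, 565, 400, 450))
mi  <- data.frame(eid = c(111, 222, 444),
                  mi.value = c(134, 234, 213))
cva <- data.frame(eid = c(333, 444, 555, 666),
                  cva.value = c(124, 132, 125, 457))

# Fold the list with left_join; extra arguments (by = "eid") are passed on
joined <- list(dat, mi, cva) %>% reduce(left_join, by = "eid")

# Same fold with the formula shorthand: .x = accumulated result, .y = next item
joined_lambda <- list(dat, mi, cva) %>%
  reduce(~ left_join(.x, .y, by = "eid"))

# Add the 0/1 membership indicators from the question
out <- joined
out$mi  <- as.integer(out$eid %in% mi$eid)
out$cva <- as.integer(out$eid %in% cva$eid)
```

Because `left_join` keeps every row of the left table, eids that appear only in `mi` or `cva` (222, 444, 666) are dropped, which matches the desired output.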