map from purrr works, but future_map from furrr does not

Time: 2018-11-20 18:14:55

Tags: r tidyverse purrr

In the code below, map_dfr from purrr works, but future_map_dfr from furrr throws an error. How can I fix this?

    #install.packages("randomForest"); install.packages("tidyverse"); install.packages("iml")
    library(tidyverse); library(iml); library(randomForest) 
    library(furrr)

    plan(multiprocess)

    set.seed(42)

    mtcars1 <- mtcars %>%  mutate(vs = as.factor(vs),
                                  id = row_number())

    x <- "vs"
    y <- paste0(setdiff(setdiff(names(mtcars1), "vs"), "id"), collapse = "+")

    rf = randomForest(as.formula(paste0(x, "~ ", y)), data = mtcars1, ntree = 50)

    predictor <- Predictor$new(rf, data = mtcars1, y = mtcars1$vs)

    # Results using map_dfr() from purrr
    shapelyresults <- map_dfr(1:nrow(mtcars), ~(Shapley$new(predictor, x.interest = mtcars1[.x,]) %>% 
                                                  .$results %>% 
                                                  as_tibble() %>% 
                                                  arrange(desc(phi)) %>% 
                                                  slice(1:5) %>% 
                                                  select(feature.value, phi) %>%
                                                  mutate(id = .x)))

    # Attempt to use future_map_dfr() from furrr
    f_shapelyresults <- future_map_dfr(1:nrow(mtcars), ~(Shapley$new(predictor, x.interest = mtcars1[.x,]) %>% 
                                                  .$results %>% 
                                                  as_tibble() %>% 
                                                  arrange(desc(phi)) %>% 
                                                  slice(1:5) %>% 
                                                  select(feature.value, phi) %>%
                                                  mutate(id = .x)))

1 answer:

Answer 0 (score: 1)

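Depending on your configuration, furrr (through future) may run the work in R subprocesses mapped to different CPU cores or threads, each with its own environment/scope.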
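To make the point about separate subprocesses concrete, here is a minimal sketch. It assumes a `multisession` plan with two workers (my choice for illustration, not the asker's configuration) and simply compares process IDs:

```r
library(future)
library(furrr)

# Assumption: two background R sessions; the asker's plan may differ
plan(multisession, workers = 2)

main_pid <- Sys.getpid()

# Each element is evaluated in a background R session,
# so Sys.getpid() reports that session's process ID
worker_pids <- future_map_int(1:4, ~ Sys.getpid())

# Is the main session's PID among the PIDs that did the work?
main_in_workers <- main_pid %in% worker_pids

# Return to sequential processing and release the workers
plan(sequential)
```

Because each worker is a fresh R session, anything attached or defined only in the main session has to be shipped to it or re-created there.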

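In my experience, two kinds of problems commonly come up:

  1. Packages are not always attached in the subprocesses.
  2. The subprocesses do not always have access to the objects they need.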

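So you could:
- Rewrite the purrr lambda as a named function and put require() calls at the top of that function, to rule out the first kind of problem.
- Pass the auxiliary data into that named function as arguments as well.

Try something like this: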

    library(furrr)

    my_function <-
      function(primary_object, Shapley_object, predictor, data) {

        # Attach the packages inside the worker (first kind of problem)
        require(tidyverse); require(iml); require(randomForest)

        # predictor and data arrive as arguments (second kind of problem)
        Shapley_object$new(predictor,
                           x.interest = data[primary_object, ]) %>%
          .$results %>%
          as_tibble() %>%
          arrange(desc(phi)) %>%
          slice(1:5) %>%
          select(feature.value, phi) %>%
          mutate(id = primary_object)
      }

    f_shapelyresults <-
      future_map_dfr(
        .x = 1:nrow(mtcars), # 1st argument: primary_object, above
        .f = my_function,
        Shapley_object = Shapley, # 2nd argument, as seen above
        predictor = predictor,    # auxiliary objects passed explicitly
        data = mtcars1
      )