How can I make purrr code work with spark.lapply?

Asked: 2018-12-11 19:01:37

Tags: r apache-spark sparkr

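How can I make this code work with SparkR's spark.lapply, so that the computation is parallelized by somehow sending the combinations of xs and ys to different nodes? I believe the change should start somewhere between tibble(ys, xs) and mutate(startformula = paste0(ys, " ~ ", 1) in the code below, but I don't know how to do it.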

library(tidyverse)
library(broom)

ys <- names(mtcars)

xs <- map(ys, ~setdiff(names(mtcars), .x)) %>% 
  map(~paste0(.x, collapse = "+")) %>%
  unlist()

models <- tibble(ys, xs) %>%
  mutate(startformula = paste0(ys, " ~ ", 1),
         endformula = paste0(ys, " ~ ", xs)) %>% 
  mutate(model = map2(startformula,
                      endformula,
                      ~glm(.x, data=mtcars, family=gaussian, maxit = 100) %>% 
                        step(direction = "forward", scope = .y, trace = FALSE))) %>% 
  mutate(pred = map(model, augment)) %>% 
  mutate(glance = map(model, glance))
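
One possible direction, offered as a minimal sketch rather than a tested answer: spark.lapply(list, func) applies func to each element of a list on the executors and collects the results back on the driver, so each response variable in ys can become one task. The helper fit_one and the use_spark check are hypothetical names introduced here for illustration. The sketch assumes that SparkR is on the R library path, that SPARK_HOME is set, and that broom is installed on every worker node. When Spark is not available it falls back to a plain lapply, so the worker function can be checked locally.

```r
# Worker: forward stepwise selection for one response column of mtcars.
# Everything it needs is built inside the function so it can be shipped
# to an executor; broom is called via `::` and must be installed there.
fit_one <- function(y) {
  predictors <- setdiff(names(mtcars), y)
  start_formula <- as.formula(paste0(y, " ~ 1"))
  end_formula <- as.formula(
    paste0(y, " ~ ", paste(predictors, collapse = " + "))
  )

  model <- step(
    glm(start_formula, data = mtcars, family = gaussian, maxit = 100),
    direction = "forward", scope = end_formula, trace = FALSE
  )

  list(y = y, model = model,
       pred = broom::augment(model), glance = broom::glance(model))
}

ys <- names(mtcars)

# Assumption: Spark is usable only if SparkR loads and SPARK_HOME is set.
use_spark <- requireNamespace("SparkR", quietly = TRUE) &&
  nzchar(Sys.getenv("SPARK_HOME"))

if (use_spark) {
  SparkR::sparkR.session()
  # One task per response variable; results are collected on the driver.
  results <- SparkR::spark.lapply(as.list(ys), fit_one)
  SparkR::sparkR.session.stop()
} else {
  # Local fallback with the same worker function.
  results <- lapply(as.list(ys), fit_one)
}

# Reassemble a tibble shaped like the original `models` object.
models <- tibble::tibble(
  ys     = vapply(results, function(r) r$y, character(1)),
  model  = lapply(results, function(r) r$model),
  pred   = lapply(results, function(r) r$pred),
  glance = lapply(results, function(r) r$glance)
)
```

Because spark.lapply collects every result on the driver, the fitted models for all tasks must fit in the driver's memory. With only 11 small models here, the Spark overhead likely outweighs any gain; the pattern is more relevant for larger data or many more formula combinations.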

0 answers:

No answers