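How can I make this code work with SparkR's spark.lapply, so that the computation is parallelized by sending the combinations of xs and ys to different nodes? I believe the change should start somewhere between tibble(ys, xs) and mutate(startformula = paste0(ys, " ~ ", 1) in the code below, but I cannot figure out how.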
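library(tidyverse)
library(broom)

# Every column of mtcars takes a turn as the response variable
ys <- names(mtcars)
# For each response, join all remaining columns into one "a+b+c" predictor string
xs <- map(ys, ~setdiff(names(mtcars), .x)) %>%
  map(~paste0(.x, collapse = "+")) %>%
  unlist()

models <- tibble(ys, xs) %>%
  # Intercept-only starting formula and full-model upper scope, both as strings
  mutate(startformula = paste0(ys, " ~ ", 1),
         endformula = paste0(ys, " ~ ", xs)) %>%
  # Forward stepwise selection from the null model toward the full model
  mutate(model = map2(startformula,
                      endformula,
                      ~glm(.x, data = mtcars, family = gaussian, maxit = 100) %>%
                        step(direction = "forward", scope = .y, trace = FALSE))) %>%
  mutate(pred = map(model, augment)) %>%
  mutate(glance = map(model, glance))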
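
One possible direction, offered as a minimal sketch and not a tested answer: spark.lapply distributes a function over the elements of a list, so the (ys, xs) combinations can be packed into a list and the glm/step fit moved into a plain function that receives one combination at a time. The names `combos`, `fit_pair`, and `run_on_spark` are hypothetical and introduced only for illustration. The sketch assumes SparkR is installed, a Spark cluster is reachable from the driver, and the worker nodes have an R installation where `mtcars` (from the datasets package) is available. The function is deliberately written with base R only, so the workers need no extra packages.

```r
# One list element per response variable: the response name and the
# "+"-joined string of all other columns (same content as ys and xs).
ys <- names(mtcars)
combos <- lapply(ys, function(y) {
  list(y = y, x = paste(setdiff(names(mtcars), y), collapse = "+"))
})

# Hypothetical worker function: fits the intercept-only model for one
# combination and runs forward stepwise selection up to the full model.
fit_pair <- function(combo) {
  start <- as.formula(paste0(combo$y, " ~ 1"))
  upper <- as.formula(paste0(combo$y, " ~ ", combo$x))
  null_model <- glm(start, data = mtcars, family = gaussian, maxit = 100)
  step(null_model, direction = "forward", scope = upper, trace = FALSE)
}

# Sanity check on the driver with base lapply; no Spark needed for this line.
local_models <- lapply(combos[1:2], fit_pair)

# Hypothetical wrapper around the Spark part. It is only defined here, not
# called, because calling it requires a working Spark installation.
run_on_spark <- function(combos) {
  SparkR::sparkR.session()
  on.exit(SparkR::sparkR.session.stop())
  # Conceptually the same as lapply(combos, fit_pair), but distributed.
  SparkR::spark.lapply(combos, fit_pair)
}
```

Under these assumptions, `model_list <- run_on_spark(combos)` should return a list of fitted models in the same order as `ys`. That list could then go back into the original pipeline, for example as `tibble(ys, model = model_list)` followed by the existing `mutate(pred = map(model, augment))` and `mutate(glance = map(model, glance))` steps on the driver. Whether this is actually faster than the local version depends on the cluster; for eleven small models the scheduling and serialization overhead may well dominate.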