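I am trying to map multiple models over multiple datasets and store the results in something like a nested tibble or multiple lists. I would like to apply the models in the same pipe. I run the following: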
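library(dplyr)
library(purrr)

data(iris)
# keep two species and recode Species as 0/1 (1 = virginica)
df <- iris %>%
    filter(Species != "setosa") %>%
    mutate(Species = +(Species == "virginica"))

# all ordered pairs of distinct predictor columns
var_combos <- expand.grid(colnames(df[,1:4]), colnames(df[,1:4])) %>%
    filter(!Var1 == Var2)

# one data frame per pair, then one logistic model per data frame
map2(
    .x = var_combos$Var1,
    .y = var_combos$Var2,
    ~select(df, .x, .y) %>%
        mutate(
            Species = df$Species
        )
) %>%
    map(., ~glm(Species ~ ., data = ., family = binomial(link='logit')))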
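This gives me a nice list of logistic models. How can I store these models in a nested tibble or list, and then mutate and add more models to be stored alongside them, for example: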
... %>%
map(., ~glm(Species ~ ., data = ., family = binomial(link='logit'))) %>%
map(., e1071::svm(Species ~ ., data = ., kernel = "polynomial"))
Answer 0 (score: 1)
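After looping over the elements of the 'var_combos' columns with map2, nest the data by creating a dummy grouping column, map over 'data', and then, inside mutate, create the list of models as a new column. The result and its 'models' can then be checked.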
library(purrr)
library(dplyr)
library(tidyr)  # provides nest()

out1 <- map2(
    var_combos$Var1,
    var_combos$Var2, ~
        df %>%
            select(Species, .x, .y) %>%
            group_by(grp = 'grp') %>%  # dummy column so everything nests into one row
            nest() %>%
            # fit both models on the nested data and keep them together in a list
            mutate(models = map(data, ~ {
                list(glm(Species ~ ., data = .x, family = binomial(link='logit')),
                     e1071::svm(Species ~ ., data = .x, kernel = "polynomial"))
            })))
out1[1:3]
#[[1]]
# A tibble: 1 x 3
# Groups: grp [1]
# grp data models
# <chr> <list> <list>
#1 grp <tibble [100 × 3]> <list [2]>
#[[2]]
# A tibble: 1 x 3
# Groups: grp [1]
# grp data models
# <chr> <list> <list>
#1 grp <tibble [100 × 3]> <list [2]>
#[[3]]
# A tibble: 1 x 3
# Groups: grp [1]
# grp data models
# <chr> <list> <list>
#1 grp <tibble [100 × 3]> <list [2]>
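The output above is a list of one-row tibbles. As a sketch of an alternative layout (not the answer's own code), the same idea can be kept in a single tibble with one row per variable pair and one list column per model type, which is close to the "mutate and add more models alongside" wording of the question. The names `out2`, `glm_fit`, and `svm_fit` are hypothetical, and `stringsAsFactors = FALSE` is an assumption made here so that the pairs are plain character vectors.

```r
library(dplyr)
library(purrr)

df <- iris %>%
  filter(Species != "setosa") %>%
  mutate(Species = +(Species == "virginica"))

preds <- colnames(df)[1:4]

# All ordered pairs of distinct predictors, kept as character columns
var_combos <- expand.grid(Var1 = preds, Var2 = preds,
                          stringsAsFactors = FALSE) %>%
  filter(Var1 != Var2)

out2 <- as_tibble(var_combos) %>%
  mutate(
    # one data frame per row: the response plus the two chosen predictors
    data    = map2(Var1, Var2, ~ df[, c("Species", .x, .y)]),
    # each model type gets its own list column, fitted on the same data
    glm_fit = map(data, ~ glm(Species ~ ., data = .x,
                              family = binomial(link = "logit"))),
    svm_fit = map(data, ~ e1071::svm(Species ~ ., data = .x,
                                     kernel = "polynomial"))
  )
```

Adding another model type is then one more `map()` line inside `mutate()`, and each model is stored once per pair rather than once per data row.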
The models fitted for the first combination:
out1[[1]]$models
#[[1]]
#[[1]][[1]]
#Call: glm(formula = Species ~ ., family = binomial(link = "logit"),
#    data = .x)
#Coefficients:
# (Intercept) Sepal.Width Sepal.Length
# -13.0460 0.4047 1.9024
#Degrees of Freedom: 99 Total (i.e. Null); 97 Residual
#Null Deviance: 138.6
#Residual Deviance: 110.3 AIC: 116.3
#[[1]][[2]]
#Call:
#svm(formula = Species ~ ., data = .x, kernel = "polynomial")
#Parameters:
# SVM-Type: eps-regression
# SVM-Kernel: polynomial
# cost: 1
# degree: 3
# gamma: 0.5
# coef.0: 0
# epsilon: 0.1
#Number of Support Vectors: 98
The reason for nesting is to avoid storing the models in a form where they are unnecessarily repeated for each row of 'data'. Here, 'data' is a list column, and we can always unnest it to 'long' format:
library(tidyr)
out1 %>%
    map(~ .x %>%
        unnest(c(data)))
Now we would see that the 'models' entry gets repeated for each row. So it is better to store it in a list column, or even to extract the 'models' as a separate dataset.
The 'models' can also be flattened with flatten.
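A minimal sketch of one way the flattening could look, offered as an assumption and not as the answer's own code. It rebuilds a small version of the nested result for just two predictor pairs (the names `pairs`, `out_small`, and `all_models` are hypothetical) and then removes the nesting with `purrr::flatten()`, one level at a time.

```r
library(dplyr)
library(purrr)
library(tidyr)

df <- iris %>%
  filter(Species != "setosa") %>%
  mutate(Species = +(Species == "virginica"))

# Two predictor pairs only, to keep the sketch small (illustrative choice)
pairs <- list(c("Sepal.Length", "Sepal.Width"),
              c("Sepal.Length", "Petal.Length"))

# Same structure as in the answer: a list of one-row nested tibbles
out_small <- map(pairs, ~ {
  df[, c("Species", .x)] %>%
    group_by(grp = "grp") %>%
    nest() %>%
    mutate(models = map(data, ~ list(
      glm(Species ~ ., data = .x, family = binomial(link = "logit")),
      e1071::svm(Species ~ ., data = .x, kernel = "polynomial")
    )))
})

all_models <- out_small %>%
  map(~ flatten(.x$models)) %>%  # per tibble: list(glm, svm)
  flatten()                      # one flat list across all tibbles
```

Each call to `flatten()` removes exactly one level of nesting, so the fitted model objects themselves stay intact. In recent purrr versions `flatten()` is superseded by `list_flatten()`, which could be used in the same positions.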