Map models and store the results in a nested tibble

Time: 2020-02-09 19:29:13

Tags: r purrr

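I am trying to map multiple models over multiple datasets and store the results in something like a nested tibble or a list of lists. I would like to apply the models within the same pipeline. I run the following: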

library(dplyr)
library(purrr)

data(iris)
df <- iris %>% 
  filter(Species != "setosa") %>% 
  mutate(Species = +(Species == "virginica"))

var_combos <- expand.grid(colnames(df[,1:4]), colnames(df[,1:4])) %>% 
  filter(!Var1 == Var2)

map2(
  .x = var_combos$Var1,
  .y = var_combos$Var2,
  ~select(df, .x, .y) %>% 
    mutate(
      Species = df$Species
    )
) %>%
  map(., ~glm(Species ~ ., data = ., family = binomial(link='logit')))

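This gives me a nice list of logistic regression models. How can I store these models in a nested tibble or a list, and then mutate to add more models alongside them, for example: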

 ...   %>%
      map(., ~glm(Species ~ ., data = ., family = binomial(link='logit'))) %>% 
      map(., e1071::svm(Species ~ ., data = ., kernel = "polynomial"))

1 answer:

Answer 0 (score: 1)

After looping over the elements of the 'var_combos' columns with map2, nest the data by creating a dummy grouping column, map over the 'data' column, and then create the list of models as a new column with mutate.

library(purrr)
library(dplyr)
library(tidyr)   # nest() and unnest() come from tidyr
out1 <- map2(
     var_combos$Var1,
     var_combos$Var2, ~
       df %>%
           select(Species, .x, .y) %>%
           group_by(grp = 'grp') %>%   # dummy group so that nest() returns a single row
           nest() %>%
           mutate(models = map(data, ~ {
             # inside this inner lambda, .x is the nested data, not the column name
             list(glm(Species ~ ., data = .x, family = binomial(link='logit')),
                  e1071::svm(Species ~ ., data = .x, kernel = "polynomial"))
           })))
out1[1:3]
#[[1]]
# A tibble: 1 x 3
# Groups:   grp [1]
#  grp   data               models    
#  <chr> <list>             <list>    
#1 grp   <tibble [100 × 3]> <list [2]>

#[[2]]
# A tibble: 1 x 3
# Groups:   grp [1]
#  grp   data               models    
#  <chr> <list>             <list>    
#1 grp   <tibble [100 × 3]> <list [2]>

#[[3]]
# A tibble: 1 x 3
# Groups:   grp [1]
#  grp   data               models    
#  <chr> <list>             <list>    
#1 grp   <tibble [100 × 3]> <list [2]>

Check the 'models':

out1[[1]]$models
#[[1]]
#[[1]][[1]]

#Call:  glm(formula = Species ~ ., family = binomial(link = "logit"), data = .x)

#Coefficients:
# (Intercept)   Sepal.Width  Sepal.Length
#    -13.0460        0.4047        1.9024

#Degrees of Freedom: 99 Total (i.e. Null);  97 Residual
#Null Deviance:     138.6
#Residual Deviance: 110.3     AIC: 116.3

#[[1]][[2]]

#Call:
#svm(formula = Species ~ ., data = .x, kernel = "polynomial")

#Parameters:
#   SVM-Type:  eps-regression
# SVM-Kernel:  polynomial
#       cost:  1
#     degree:  3
#      gamma:  0.5
#     coef.0:  0
#    epsilon:  0.1

#Number of Support Vectors:  98

The reason for nesting is to avoid storing the models in a way that needlessly repeats them for every row of the 'data'. Here, 'data' is a list column, so we can unnest it at any time to get the 'long' format:

out1 %>%
   map(~ .x %>% unnest(c(data)))

Now the 'models' list is repeated for every row. So it is better to keep it in a list column, or even to extract the 'models' as a separate dataset.
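As a sketch of the "separate dataset" idea, the variable pairs, their data, and each kind of model can also be kept in a single tibble with one row per pair, so that further models are added with another mutate. This is only one possible layout and not part of the code above: the names models_tbl, glm_fit and svm_fit are made up for illustration, and expand.grid is called with stringsAsFactors = FALSE so that the column names stay plain character strings for all_of().

```r
library(dplyr)
library(purrr)

df <- iris %>%
  filter(Species != "setosa") %>%
  mutate(Species = +(Species == "virginica"))

# All ordered pairs of distinct predictor names, kept as character strings
var_combos <- expand.grid(Var1 = names(df)[1:4], Var2 = names(df)[1:4],
                          stringsAsFactors = FALSE) %>%
  filter(Var1 != Var2)

# One row per variable pair; each model type lives in its own list column
# (models_tbl, glm_fit and svm_fit are hypothetical names)
models_tbl <- var_combos %>%
  as_tibble() %>%
  mutate(
    data    = map2(Var1, Var2, ~ select(df, Species, all_of(c(.x, .y)))),
    glm_fit = map(data, ~ glm(Species ~ ., data = .x,
                              family = binomial(link = "logit"))),
    svm_fit = map(data, ~ e1071::svm(Species ~ ., data = .x,
                                     kernel = "polynomial"))
  )

models_tbl
```

With this layout, models_tbl$glm_fit[[1]] returns the logistic model for the first pair, and another model type can be appended with one more mutate(... = map(data, ...)) step.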

Update

If we want to flatten the 'models'