I have a question about the map function in the purrr package.
Taking the mtcars dataset as an example:
library(dplyr)
library(purrr)
# Create a second data frame
mtcars2 <- mtcars
#change one variable just to distinguish them
mtcars2$mpg <- mtcars2$mpg / 2
#create the list
dflist <- list(mtcars,mtcars2)
#then, a simple function example
my_fun <- function(x) {
  # Return the summary directly instead of assigning it back to x
  x %>%
    summarise(`sum of mpg` = sum(mpg),
              `sum of cyl` = sum(cyl))
}
#then, using map, this works and prints the desired results
list_results <- map(dflist,my_fun)
However, I need to save the modified mtcars and mtcars2 as R objects (data frames).
Thank you very much in advance!
Answer 0: (score: 2)
Here is one attempt:
library(purrr)
library(tidyverse)
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2
dflist <- list(mtcars,mtcars2)
To save an object, you need to give it a specific name and use:
assign("name", object, envir = .GlobalEnv)
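As a minimal illustration of what assign() does, the sketch below binds a value to a name that is built as a string at run time. The name demo_object is made up for this example and is not part of the original answer.

```r
# Build the object name as a string (hypothetical name, for illustration only)
obj_name <- paste0("demo_", "object")

# Bind the first three rows of mtcars to that name in the global environment
assign(obj_name, head(mtcars, 3), envir = .GlobalEnv)

# The object can now be used like any other variable
exists("demo_object", envir = .GlobalEnv)
nrow(demo_object)
```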
Here is one way to do that:
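my_fun <- function(x, list) {
  # Pick the x-th data frame from the list passed in (not the global dflist)
  listi <- list[[x]]
  # Save that data frame to the global environment under a generated name
  assign(paste0("object_from_function_", x), listi, envir = .GlobalEnv)
  # Return the summary statistics
  listi %>%
    summarise(`sum of mpg` = sum(mpg),
              `sum of cyl` = sum(cyl))
}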
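my_fun takes two arguments: x, an index from seq_along(list) that is used to build the object name, and list, the list to process.
The call below saves two objects, object_from_function_1 and object_from_function_2: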
list_results <- map(seq_along(dflist), my_fun, dflist)
Another approach is to use list2env outside of the map function, as akrun suggested:
dflist <- list(mtcars,mtcars2)
names(dflist) <- c("mtcars","mtcars2")
list2env(dflist, envir = .GlobalEnv) #this will create two objects `mtcars` and `mtcars2`
and run map after you have created the objects.
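If the goal is to save the summarised results rather than the inputs, the same list2env idea can be applied to the output of map. The following is only a sketch under that assumption; the function name summarise_df and the object names mtcars_summary and mtcars2_summary are hypothetical.

```r
library(dplyr)
library(purrr)

mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2

# Name the list elements; these names become the object names later on
dflist <- list(mtcars_summary = mtcars, mtcars2_summary = mtcars2)

summarise_df <- function(df) {
  df %>%
    summarise(`sum of mpg` = sum(mpg),
              `sum of cyl` = sum(cyl))
}

# map() keeps the names of its input, so the result is a named list
list_results <- map(dflist, summarise_df)

# Create one object per list element in the global environment
invisible(list2env(list_results, envir = .GlobalEnv))
```

Keeping the results in the named list and only exporting them at the end means the summarising function itself has no side effects.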
Answer 1: (score: 0)
Here is a solution that combines purrr::walk() with get() and assign(). It is similar to the above, but not identical.
library(dplyr)
library(purrr)
data(mtcars)
Create the second data frame.
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2
Create the function to apply to each data frame.
sum_mpg_cyl <- function(.data) {
  .data %>%
    summarise(
      `sum of mpg` = sum(mpg),
      `sum of cyl` = sum(cyl)
    )
}
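Apply sum_mpg_cyl() to mtcars and mtcars2, saving the two summary-statistics data frames to the global environment under the same names as the inputs, so the global copies of mtcars and mtcars2 are overwritten. One potential advantage of this approach is that you don't need to create a separate list of data frames.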
walk(
  .x = c("mtcars", "mtcars2"),
  .f = function(df_name) {
    # Get the data frame from the global environment
    df <- get(df_name, envir = .GlobalEnv)
    # Calculate the summary statistics
    df <- sum_mpg_cyl(df)
    # Save the data frames containing summary statistics back to the global
    # environment
    assign(df_name, df, envir = .GlobalEnv)
  }
)
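I would probably also compute the summary inside the anonymous function and save the two summary-statistics data frames under different names, like so: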
# Reset the data
data(mtcars)
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2
walk(
  .x = c("mtcars", "mtcars2"),
  .f = function(df_name) {
    # Get the data frame from the global environment
    df <- get(df_name, envir = .GlobalEnv)
    # Calculate the summary statistics
    df <- df %>%
      summarise(
        `sum of mpg` = sum(mpg),
        `sum of cyl` = sum(cyl)
      )
    # Rename the data frames containing summary statistics to distinguish
    # them from the input data frames
    new_df_name <- paste(df_name, "stats", sep = "_")
    # Save the data frames containing summary statistics back to the global
    # environment
    assign(new_df_name, df, envir = .GlobalEnv)
  }
)
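If the summaries do not have to live in the global environment as separate objects, a possible variation is to collect the inputs by name with mget() and keep the results in a named list. This is a sketch rather than part of the answer above; the name stats_list is made up. Note that mget() does not search enclosing environments by default, so data(mtcars) is called first to place a copy of mtcars in the global environment.

```r
library(dplyr)
library(purrr)

# Load mtcars into the global environment and build the second data frame
data(mtcars)
mtcars2 <- mtcars
mtcars2$mpg <- mtcars2$mpg / 2

# mget() looks up several objects by name and returns them as a named list
input_list <- mget(c("mtcars", "mtcars2"), envir = .GlobalEnv)

# Summarise each data frame; the list names carry over to the result
stats_list <- map(
  input_list,
  function(df) {
    df %>%
      summarise(
        `sum of mpg` = sum(mpg),
        `sum of cyl` = sum(cyl)
      )
  }
)

# Access one result by name
stats_list$mtcars2
```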