I have a list of data frames (`survey08`, `survey09`, `survey10`, etc.) called `df_list`.
Each data frame contains two columns, named `year` and `employed`.
# create 3 dataframes with identical column names
survey08 <- data.frame(year = 2008, employed = c(1, 2, 2, 1, 2))
survey09 <- data.frame(year = 2009, employed = c(1, 1, 1, 2, 1))
survey10 <- data.frame(year = 2010, employed = c(2, 1, 1, 1, 1))
# put dataframes into a list
df_list <- list(survey08, survey09, survey10)
# add names for dataframes in list
# names correspond to survey year ('year' column)
names(df_list) <- c("survey08", "survey09", "survey10")
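I want to recode the values in the `employed` column (1 = yes, 2 = no), but only in the `survey08` and `survey09` data frames. For the other data frames in the list, I want to keep the original column values (i.e., modify only specific data frames in the list).
I tried the following code, using the `year` column as a filter: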
library(tidyverse)
# modify only values in 'employed' column for DFs 'survey08' and 'survey09'
# use 'year' column as filter
df_list %>%
map(~filter(.x, year %in% 2008:2009)) %>%
map(~ .x %>% mutate_at(vars(employed), ~recode_factor(.,`1` = "yes", `2` = "no")))
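Although this correctly recodes the two data frames (`survey08` and `survey09`), it does not preserve the values in the other data frames in the list.
Current output: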
#> $survey08
#> year employed
#> 1 2008 yes
#> 2 2008 no
#> 3 2008 no
#> 4 2008 yes
#> 5 2008 no
#>
#> $survey09
#> year employed
#> 1 2009 yes
#> 2 2009 yes
#> 3 2009 yes
#> 4 2009 no
#> 5 2009 yes
#>
#> $survey10
#> [1] year employed
#> <0 rows> (or 0-length row.names)
Desired output:
$survey08
year employed
1 2008 yes
2 2008 no
3 2008 no
4 2008 yes
5 2008 no
$survey09
year employed
1 2009 yes
2 2009 yes
3 2009 yes
4 2009 no
5 2009 yes
$survey10
year employed
1 2010 2
2 2010 1
3 2010 1
4 2010 1
5 2010 1
Created on 2019-08-24 by the reprex package (v0.3.0)
Answer 0 (score: 2)
You can use `purrr::map_at` to modify only the elements specified by name or position.
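This answer gives no code, so here is a minimal sketch of how `map_at` might be applied to the data from the question. The result name `recoded` is only illustrative, and `mutate()` with `recode_factor()` is used in place of the question's `mutate_at()` call as a matter of taste.

```r
library(purrr)
library(dplyr)

# Same example data as in the question
df_list <- list(
  survey08 = data.frame(year = 2008, employed = c(1, 2, 2, 1, 2)),
  survey09 = data.frame(year = 2009, employed = c(1, 1, 1, 2, 1)),
  survey10 = data.frame(year = 2010, employed = c(2, 1, 1, 1, 1))
)

# Apply the recoding only to the elements named in .at;
# every other element of the list is returned unchanged.
recoded <- df_list %>%
  map_at(
    c("survey08", "survey09"),
    ~ mutate(.x, employed = recode_factor(employed, `1` = "yes", `2` = "no"))
  )

recoded
```

Positions should work as well, e.g. `map_at(df_list, 1:2, ...)`, if the list is not named.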
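Answer 1 (score: 1)
Using `filter` removes every row from the data frames that do not match the condition, which is why `survey10` ends up with 0 rows instead of being kept as it was. You need `map_if` instead of `map`. You can then use `.p` to identify the elements the mapped function should be applied to.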
df_list %>%
  map_if(
    .p = c(TRUE, TRUE, FALSE),
    .f = ~ .x %>% mutate_at(vars(employed), ~ recode_factor(., `1` = "yes", `2` = "no"))
  )
or
df_list %>%
  map_if(
    .p = ~ .x %>% pull(year) %>% unique() %in% 2008:2009,
    .f = ~ .x %>% mutate_at(vars(employed), ~ recode_factor(., `1` = "yes", `2` = "no"))
  )
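The predicate above assumes each data frame holds exactly one distinct year; with several years it would not reduce to a single TRUE/FALSE. A slightly more defensive sketch wraps the check in `all()`. The helper name `is_target_year` is hypothetical and not part of the original answer.

```r
library(purrr)
library(dplyr)

# Same example data as in the question
df_list <- list(
  survey08 = data.frame(year = 2008, employed = c(1, 2, 2, 1, 2)),
  survey09 = data.frame(year = 2009, employed = c(1, 1, 1, 2, 1)),
  survey10 = data.frame(year = 2010, employed = c(2, 1, 1, 1, 1))
)

# Hypothetical helper: TRUE only if every row's year is 2008 or 2009
is_target_year <- function(df) all(df$year %in% 2008:2009)

# Recode 'employed' only in the data frames for which the predicate is TRUE
recoded <- map_if(
  df_list,
  .p = is_target_year,
  .f = ~ mutate(.x, employed = recode_factor(employed, `1` = "yes", `2` = "no"))
)

recoded
```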
Answer 2 (score: 1)
A base R solution using `lapply` and a user-defined function that checks whether `year` is less than 2010.
df_list2 <- lapply(df_list, function(x){
if (unique(x$year) < 2010){
x$employed <- as.character(factor(x$employed, levels = c(1, 2), labels = c("yes", "no")))
}
return(x)
})
df_list2
# $survey08
# year employed
# 1 2008 yes
# 2 2008 no
# 3 2008 no
# 4 2008 yes
# 5 2008 no
#
# $survey09
# year employed
# 1 2009 yes
# 2 2009 yes
# 3 2009 yes
# 4 2009 no
# 5 2009 yes
#
# $survey10
# year employed
# 1 2010 2
# 2 2010 1
# 3 2010 1
# 4 2010 1
# 5 2010 1
Answer 3 (score: 0)
If you already know which list elements you want to change, why not just subset them and recode?
library(tidyverse)
df_list[c("survey08", "survey09")] <- df_list[c("survey08", "survey09")] %>%
map(~ .x %>% mutate_at(vars(employed), ~recode_factor(.,`1` = "yes", `2` = "no")))
df_list
#$survey08
# year employed
#1 2008 yes
#2 2008 no
#3 2008 no
#4 2008 yes
#5 2008 no
#$survey09
# year employed
#1 2009 yes
#2 2009 yes
#3 2009 yes
#4 2009 no
#5 2009 yes
#$survey10
# year employed
#1 2010 2
#2 2010 1
#3 2010 1
#4 2010 1
#5 2010 1
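The same subset-and-assign idea can probably be written without the tidyverse. The following is a base R sketch under that assumption; the `targets` vector is an illustrative name, and `factor()` stands in for `recode_factor()`.

```r
# Same example data as in the question
df_list <- list(
  survey08 = data.frame(year = 2008, employed = c(1, 2, 2, 1, 2)),
  survey09 = data.frame(year = 2009, employed = c(1, 1, 1, 2, 1)),
  survey10 = data.frame(year = 2010, employed = c(2, 1, 1, 1, 1))
)

# Names of the list elements to recode (illustrative variable name)
targets <- c("survey08", "survey09")

# Recode only the selected elements and write them back into the list;
# elements not listed in 'targets' are never touched.
df_list[targets] <- lapply(df_list[targets], function(df) {
  df$employed <- factor(df$employed, levels = c(1, 2), labels = c("yes", "no"))
  df
})

df_list
```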