Find all columns whose name contains "gender" in a list of data frames in R and subset them?

Asked: 2018-08-15 17:52:46

Tags: r list select nested subset

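I have a nested list, which is basically a list containing a bunch of data frames. Some of those data frames have a "gender" column, and I would like to extract all of those columns and bind them together, so that I end up with one long vector or data frame of gender (plus the race column, and possibly a few other variables). I was hoping that select, contains, and something else from the tidyverse would work, but I don't think I can do it that way.

Here is some mock data and the scenario (in reality I have multiple lists, each containing thousands of dfs with many variables):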

df1 <- data.frame("name" = c("John","mia"), "race" = c("asian","black"), "gender" = c("male","female"))
df2 <- data.frame("name" = c("nick","kate"), "race" = c("asian","white"), "gender" = c("male","female"))
df3 <- data.frame("name" = c("ed","may"), "race" = c("black","white"), "gender" = c("male","female"))
df4 <- data.frame("name" = c("Jo","mary"), "race" = c("white","black"), "gender" = c("male","female"))

# list() keeps each data frame intact; c(df1, df2) would flatten them into a list of columns
lista <- list(df1, df2)
listb <- list(df3, df4)

I would like something like this:

newdf <- data.frame("race" = c("asian", "black", "asian", "white", "black", "white", "white", "black"), "gender" = c("male", "female", "male", "female", "male", "female", "male", "female"))

I was hoping I could do something like:

list %>% select(df) %>% select(contains("race"), contains("gender")) %>% 
         bind_rows(.)

But obviously I can't do that.

2 answers:

Answer 0 (score: 0)

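# Pre-allocate a list with three empty slots; list(length=3) would instead
# create a one-element list whose element is named "length"
mylist <- vector("list", length = 3)
mylist[[1]] <- data.frame(gender = c("male", "female"), num = 1:2, stringsAsFactors = FALSE)
mylist[[2]] <- data.frame(gender = c("female", "male"), var = 3:4, stringsAsFactors = FALSE)
mylist[[3]] <- data.frame(something = letters[1:5], stringsAsFactors = FALSE)

# x$gender is NULL for data frames without that column, and unlist() drops the NULLs
unlist(lapply(mylist, function(x) x$gender))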
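
The same base R idea can probably be stretched to cover several columns at once, as the question asks for race and gender. The following is only a sketch under assumptions: the helper `pick_cols`, the list name `dfs`, and the extra frame `no_gender` are made up for illustration, and it assumes every data frame that matches has the same set of matching columns so that rbind can stack them.

```r
# Mock data modelled on the question; no_gender is a hypothetical frame without the target columns
df1 <- data.frame(name = c("John", "mia"), race = c("asian", "black"), gender = c("male", "female"))
df2 <- data.frame(name = c("nick", "kate"), race = c("asian", "white"), gender = c("male", "female"))
no_gender <- data.frame(something = letters[1:5])
dfs <- list(df1, df2, no_gender)

# Hypothetical helper: keep only the columns whose name matches the regex
pick_cols <- function(d, pattern) d[, grepl(pattern, names(d)), drop = FALSE]

picked <- lapply(dfs, pick_cols, pattern = "race|gender")

# Drop the data frames in which nothing matched
picked <- Filter(function(d) ncol(d) > 0, picked)

# Stack the remaining pieces row-wise and reset the row names
result <- do.call(rbind, picked)
rownames(result) <- NULL
```

Here `result` should hold the race and gender columns of df1 and df2 stacked on top of each other, with `no_gender` contributing nothing.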

Answer 1 (score: 0)

A data example would be good, but here is a start:

library(purrr)
library(dplyr)

map_dfr(lst_of_dataframes, ~select(.x, contains('gender')))

For the case you made up, the geniuses who created the tidyverse have you covered:

map_dfr(lst_of_dataframes, ~select(.x, matches('gender|race')))
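
To tie this back to the question's mock data, here is a minimal sketch that assumes lista and listb are built with list() rather than c(), so that each element is still a whole data frame. The name `all_dfs` is an assumption introduced for illustration; it is not in the original post.

```r
library(purrr)
library(dplyr)

df1 <- data.frame(name = c("John", "mia"), race = c("asian", "black"), gender = c("male", "female"))
df2 <- data.frame(name = c("nick", "kate"), race = c("asian", "white"), gender = c("male", "female"))
df3 <- data.frame(name = c("ed", "may"), race = c("black", "white"), gender = c("male", "female"))
df4 <- data.frame(name = c("Jo", "mary"), race = c("white", "black"), gender = c("male", "female"))

lista <- list(df1, df2)
listb <- list(df3, df4)

# c() on two lists concatenates them into one flat list of four data frames
all_dfs <- c(lista, listb)

# Select the matching columns in each data frame, then row-bind the results
newdf <- map_dfr(all_dfs, ~ select(.x, matches("gender|race")))
```

With this data, `newdf` should have the same eight rows of race and gender as the newdf sketched in the question. In purrr 1.0.0 and later, map_dfr() is marked as superseded, and `map()` followed by `list_rbind()` is the suggested replacement, although map_dfr() still works.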