List of data frames in R

Time: 2019-10-24 08:04:05

Tags: r

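I have a list of data frames in which some columns contain the special character sequence -> (an arrow). I want to loop over this list, find the columns that contain -> (arrow), split them at the arrow, and name the resulting columns with the suffixes _old and _new. Here is a sample of the data frames: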

dput(df1)
df1 <- structure(list(v1 = c("reg->joy", "ress", "mer->dls"),
                      t2 = c("James","Jane", "Egg")),
                 class = "data.frame", row.names = c(NA,  -3L))

dput(df2)
df2 <- structure(list(v1 = c("me", "df", "kl"),
                      t2 = c("James","Jane->dlt", "Egg"),
                      t3 = c("James ->may","Jane", "Egg")),
                 class = "data.frame", row.names = c(NA,  -3L))
dput(df3)
df3 <- structure(list(v1 = c("56->34", "df23-> ", "mkl"),
                      t2 = c("James","Jane", "Egg"),
                      d3 = c("James->","Jane", "Egg")),
                 class = "data.frame", row.names = c(NA,  -3L))

Here is what I tried:

dfs <- list(df1,df2,df3)

for (y in 1:length(dfs)){
  setDT(dfs[[y]])
  df1<- lapply(names(dfs[[y]]), function(x) {
    mDT <- df2[[y]][, tstrsplit(get(x), " *-> *")]
    if (ncol(mDT) == 2L) setnames(mDT, paste0(x, c("_old", "_new")))
  }) %>% as.data.table()

}

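This splits only one data frame, and I need to split all of them. Note: my code works fine on a single data frame; what I want to know is how to apply it to a list of data frames.

Expected output: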


dput(df1)
df1 <- structure(list(v1_old = c("reg", "mer"),
                      v1_new = c("joy", "dls")),
                 class = "data.frame", row.names = c(NA,  -3L))

dput(df2)
df2 <- structure(list(t2_old = c("Jane"),
                      t2_new = c("dlt"),
                      t3_old = c("James"),
                      t3_new = c("may")),
                 class = "data.frame", row.names = c(NA,  -3L))

dput(df3)
df3 <- structure(list(v1_old = c("56", "df23 "),
                      v1_new = c("34", " "),
                      d3_old = c("James"),
                      d3_new = c(" ")),
                 class = "data.frame", row.names = c(NA,  -3L))

1 answer:

Answer 0 (score: 0)

Below is a solution using the tidyverse.

First, keep a column only if at least one of its strings contains an arrow:

library(tidyverse)  # provides dplyr, purrr, stringr and tidyr used below
col_arrow_ls <- purrr::map(dfs, ~select_if(., ~any(str_detect(., "->"))))
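
To make the selection step concrete, here is a minimal base-R sketch of the same idea, assuming the three sample data frames from the question. The helper name has_arrow and the result name col_arrow_base are hypothetical, and grepl(..., fixed = TRUE) is swapped in for str_detect; this is not the answer's own code.

```r
# Sample data from the question
df1 <- data.frame(v1 = c("reg->joy", "ress", "mer->dls"),
                  t2 = c("James", "Jane", "Egg"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(v1 = c("me", "df", "kl"),
                  t2 = c("James", "Jane->dlt", "Egg"),
                  t3 = c("James ->may", "Jane", "Egg"),
                  stringsAsFactors = FALSE)
df3 <- data.frame(v1 = c("56->34", "df23-> ", "mkl"),
                  t2 = c("James", "Jane", "Egg"),
                  d3 = c("James->", "Jane", "Egg"),
                  stringsAsFactors = FALSE)
dfs <- list(df1, df2, df3)

# has_arrow() is a hypothetical helper: TRUE if any value contains the literal "->"
has_arrow <- function(col) any(grepl("->", col, fixed = TRUE))

# Keep only the arrow columns of each data frame;
# drop = FALSE keeps the result a data frame even when one column survives
col_arrow_base <- lapply(dfs, function(df) {
  df[, vapply(df, has_arrow, logical(1)), drop = FALSE]
})

# Column names that survive in each data frame
lapply(col_arrow_base, names)
```

For the sample data this keeps v1 in the first data frame, t2 and t3 in the second, and v1 and d3 in the third.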

Then split the columns with tidyr::separate. Because each piece of the output is a data frame, use purrr::map_dfc to bind the pieces together column-wise:

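# Split every column of a data frame into <name>_old and <name>_new at "->"
split_df_fn <- function(df1){
  names(df1) %>%
    map_dfc(~ df1 %>% 
               select(all_of(.x)) %>%  # all_of() makes the character selection explicit
               tidyr::separate(.x, 
                               into = paste0(.x, c("_old", "_new")), 
                               sep = "->")  # values without an arrow get NA in *_new
    )
}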

Apply the function to the list of data frames:

purrr::map(col_arrow_ls, split_df_fn)

[[1]]
  v1_old v1_new
1    reg    joy
2   ress   <NA>
3    mer    dls

[[2]]
  t2_old t2_new t3_old t3_new
1  James   <NA> James     may
2   Jane    dlt   Jane   <NA>
3    Egg   <NA>    Egg   <NA>

[[3]]
  v1_old v1_new d3_old d3_new
1     56     34  James       
2   df23          Jane   <NA>
3    mkl   <NA>    Egg   <NA>
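
Note that splitting on the bare "->" keeps surrounding whitespace, so "James ->may" yields "James " with a trailing space. As a rough sketch only, the following base-R variant trims spaces around the arrow, in the spirit of the " *-> *" pattern from the question. The function name split_arrow_cols is hypothetical, the sample data are re-created so the block runs on its own, and, like the output above, it keeps rows without an arrow (with NA in the _new column) rather than dropping them.

```r
# Sample data from the question
dfs <- list(
  data.frame(v1 = c("reg->joy", "ress", "mer->dls"),
             t2 = c("James", "Jane", "Egg"),
             stringsAsFactors = FALSE),
  data.frame(v1 = c("me", "df", "kl"),
             t2 = c("James", "Jane->dlt", "Egg"),
             t3 = c("James ->may", "Jane", "Egg"),
             stringsAsFactors = FALSE),
  data.frame(v1 = c("56->34", "df23-> ", "mkl"),
             t2 = c("James", "Jane", "Egg"),
             d3 = c("James->", "Jane", "Egg"),
             stringsAsFactors = FALSE)
)

# split_arrow_cols() is a hypothetical helper, not part of any package
split_arrow_cols <- function(df) {
  pieces <- lapply(names(df), function(nm) {
    col <- df[[nm]]
    arrow <- grepl("->", col, fixed = TRUE)
    if (!any(arrow)) return(NULL)            # skip columns without any arrow
    old <- sub(" *->.*$", "", col)           # text before the arrow, spaces trimmed
    new <- ifelse(arrow, sub("^.*-> *", "", col), NA_character_)  # text after it
    out <- data.frame(old, new, stringsAsFactors = FALSE)
    names(out) <- paste0(nm, c("_old", "_new"))
    out
  })
  # Bind the per-column pieces side by side, ignoring the skipped columns
  do.call(cbind, Filter(Negate(is.null), pieces))
}

res <- lapply(dfs, split_arrow_cols)
res[[2]]
```

A value such as "James->" ends up with an empty string in the _new column, because there is nothing after the arrow.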