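How to add the corresponding file name as a new column using tidyverse

Time: 2018-12-16 08:52:47

Tags: r tidyverse

Following up on this earlier post, we can row-bind the corresponding data frames with the following code: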

library(purrr)  # map(), reduce(), transpose()
library(dplyr)  # bind_rows(), also re-exports %>%

options(stringsAsFactors = FALSE)
games <- data.frame(index = c(1,2,3), player = c('John', 'Sam', 'Mary'))
weather <- data.frame(index = c(1,2,3), temperature = c('hot', 'cold', 'rainy'))
list1 <- list(games = games, weather = weather)

games <- list()
weather <- data.frame(index = c(1,2,3), temperature = c('cold', 'rainy', 'hot'))
cars <- data.frame(index = c(1,2,3), car = c('honda', 'toyota','bmw'))
list2 <- list(games = games, weather = weather, cars = cars)

games <- data.frame(index = c(1,2,3), player = c('Peter', 'Kevin', 'Mary'))
weather <- list()
list3 <- list(games = games, weather = weather)

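all_list <- list(list1, list2, list3)
# Union of the element names found in any list: "games", "weather", "cars"
all_names <- all_list %>% map(names) %>% reduce(union)
# Regroup the elements by name, then row-bind the pieces of each group
all_list %>%
  transpose(.names = all_names) %>%
  map(dplyr::bind_rows)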

which returns

$games
  index player
1     1   John
2     2    Sam
3     3   Mary
4     1  Peter
5     2  Kevin
6     3   Mary

$weather
  index temperature
1     1         hot
2     2        cold
3     3       rainy
4     1        cold
5     2       rainy
6     3         hot

$cars
  index    car
1     1  honda
2     2 toyota
3     3    bmw
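
To see why this works, it can help to look at what `transpose(.names = ...)` produces before `bind_rows` is applied. The sketch below uses two small made-up lists (`a` and `b` are hypothetical, not part of the question) and assumes purrr is installed. A name that is missing from one of the inputs ends up as a `NULL` slot in the regrouped list.

```r
library(purrr)

# Two toy lists with different element names (hypothetical data)
a <- list(games = data.frame(index = 1), weather = data.frame(index = 2))
b <- list(weather = data.frame(index = 3), cars = data.frame(index = 4))

# Collect every element name that occurs in any of the lists
all_names <- reduce(map(list(a, b), names), union)

# Regroup: one entry per name, each holding one slot per input list
regrouped <- transpose(list(a, b), .names = all_names)

names(regrouped)               # the three group names
is.null(regrouped$games[[2]])  # b has no "games", so this slot is NULL
```

Without `.names`, `transpose()` takes the structure from the first list only, so `cars` would not appear in this toy example.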

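Suppose the lists were imported from different files, e.g. list1, list2 and list3 come from 1001.csv, 2005.csv and 3009.csv respectively.

I would like to add a column that holds the corresponding file name, so that the result looks like this: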

$`games`
  index player userid
1     1   John   1001
2     2    Sam   1001
3     3   Mary   1001
4     1  Peter   3009
5     2  Kevin   3009
6     3   Mary   3009

$weather
  index temperature userid
1     1         hot   1001
2     2        cold   1001
3     3       rainy   1001
4     1        cold   2005
5     2       rainy   2005
6     3         hot   2005

$cars
  index    car userid
1     1  honda   2005
2     2 toyota   2005
3     3    bmw   2005

I tried the following, but it did not work.

  file.nm <- gsub(".csv", "",file.list)
  list(list1, list2, list3) %>% map(function(x) bind_cols(x, file.nm)) %>% 
     transpose(.names = all_names) %>%
     map(dplyr::bind_rows)

where file.list is a vector containing the file names (.csv).

Could you give me some advice?
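
A side note on stripping the extension: in `gsub(".csv", "", ...)` the dot is a regular-expression wildcard, so any character followed by `csv` is removed, not only a trailing `.csv`. For the three file names here the result is the same, but a name such as the made-up `my_csv_01.csv` would be mangled. A minimal sketch of two safer variants, using base R only (the file names are hypothetical stand-ins for file.list):

```r
# Hypothetical file names standing in for file.list
file.list <- c("1001.csv", "2005.csv", "my_csv_01.csv")

# Original pattern: "." matches any character, so "_csv" is removed as well
loose <- gsub(".csv", "", file.list)

# Escape the dot and anchor at the end so only a trailing ".csv" is removed
strict <- sub("\\.csv$", "", file.list)

# Base R helper that drops the extension, whatever it is
sans_ext <- tools::file_path_sans_ext(file.list)
```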

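1 Answer:

Answer 0: (score: 0)

There are three problems with your approach: x is a list of data frames rather than a single data frame; file.nm is a vector of names rather than a single name; and some elements of x may have length zero, in which case nothing should be bound to them.

We could use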
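
These points can be checked directly on one of the inputs. The sketch below rebuilds `list2` from the question and inspects it with base R only. Note that `lengths()` reports the number of columns for a data frame, which is why `lengths(.x) > 0` is enough to tell the empty placeholders apart from real data.

```r
# list2 from the question: "games" is an empty list, the rest are data frames
list2 <- list(
  games   = list(),
  weather = data.frame(index = c(1, 2, 3), temperature = c("cold", "rainy", "hot")),
  cars    = data.frame(index = c(1, 2, 3), car = c("honda", "toyota", "bmw"))
)

is.data.frame(list2)   # FALSE: a list of data frames, not a single data frame
lengths(list2)         # 0 for games, 2 (columns) for weather and cars

# Keep only the non-empty elements, as the filter in the answer does
non_empty <- list2[lengths(list2) > 0]
names(non_empty)
```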

files.nm <- c("1001", "2005", "3009")
list(list1, list2, list3) %>%
  # Per file: drop the empty elements, then add userid to each remaining data frame
  map2(files.nm, ~ map(.x[lengths(.x) > 0], cbind, userid = .y)) %>%
  transpose(.names = all_names) %>%
  map(dplyr::bind_rows)

which returns

$games
  index player userid
1     1   John   1001
2     2    Sam   1001
3     3   Mary   1001
4     1  Peter   3009
5     2  Kevin   3009
6     3   Mary   3009

$weather
  index temperature userid
1     1         hot   1001
2     2        cold   1001
3     3       rainy   1001
4     1        cold   2005
5     2       rainy   2005
6     3         hot   2005

$cars
  index    car userid
1     1  honda   2005
2     2 toyota   2005
3     3    bmw   2005
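
The same idea can also be wrapped in a small helper, which may read more easily when there are many files. This is only a sketch of one possible variant, not the method from the answer: `add_userid` and the toy inputs are hypothetical names, and it swaps the `lengths()` filter for `purrr::compact()` and `cbind` for `dplyr::mutate()`.

```r
library(purrr)
library(dplyr)

# Hypothetical helper: tag every non-empty data frame in `lst` with `file_id`
add_userid <- function(lst, file_id) {
  lst %>%
    compact() %>%                          # drop NULL and zero-length elements
    map(~ mutate(.x, userid = file_id))    # assumes no column is named file_id
}

# Toy inputs (hypothetical), one list per file
files.nm <- c("1001", "3009")
lists <- list(
  list(games = data.frame(index = 1:2, player = c("John", "Sam"))),
  list(games = data.frame(index = 1:2, player = c("Peter", "Kevin")),
       weather = list())
)

# Tag each list with its file name, then regroup by element name and row-bind
tagged <- map2(lists, files.nm, add_userid)
all_names <- tagged %>% map(names) %>% reduce(union)
result <- tagged %>%
  transpose(.names = all_names) %>%
  map(bind_rows)

result$games
```

Computing `all_names` after the empty elements are dropped means a name that is empty in every file does not produce an empty group in the result.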