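How do I tidy data with a nested data frame?

Asked: 2018-02-09 23:13:33

Tags: r functional-programming tidyr tidyverse purrr

I want to tidy a nested data frame, but I am running into some difficulty. I can reshape the data for a single case, but I would like to iterate over every case in the data.

My data looks like this: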

library(tidyverse)

df <- tibble(
  case = c("a","a","b","b","c","c"),
  year = c(1990,2000,1990,2000,1990,2000),
  var1 = round(runif(6,0,1), 2),
  var2 = round(runif(6,10,20), 2)
)

I can do what I want for a single case with tidyr:

df %>%
  filter( case == "a") %>%
  gather(var, value, -c(1:2)) %>%
  spread(year, value)

Output:

#      case  var  `1990` `2000`
#     <chr> <chr>  <dbl>  <dbl>
#    1 a     var1   0.850  0.540
#    2 a     var2  14.4   16.7  

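How can I use purrr or other functional programming tools to vectorize this operation, apply it to all of my cases, and bind the results into one data frame? Some combination of "nest" and "map"?

Thanks!

2 Answers:

Answer 0 (score: 3)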

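Don't gather the case column. With both case and year kept out of gather(), you can drop the filter() step and the same pipeline handles every case at once, with no nesting or mapping needed.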

library(tidyverse)

set.seed(1234)

df <- tibble(
  case = c("a","a","b","b","c","c"),
  year = c(1990,2000,1990,2000,1990,2000),
  var1 = round(runif(6,0,1), 2),
  var2 = round(runif(6,10,20), 2)
)

df %>% 
  gather(var, value, -c(1:2)) %>%
  spread(year, value)
# # A tibble: 6 x 4
#   case  var   `1990` `2000`
#   <chr> <chr>  <dbl>  <dbl>
# 1 a     var1   0.110  0.620
# 2 a     var2  10.1   12.3  
# 3 b     var1   0.610  0.620
# 4 b     var2  16.7   15.1  
# 5 c     var1   0.860  0.640
# 6 c     var2  16.9   15.4  
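
If a nest-and-map formulation is still wanted, as the question literally asks, the following is a minimal sketch rather than part of the original answer. The helper name reshape_case is hypothetical, and the code assumes a tidyverse installation in which group_by() followed by nest() produces a list column named data.

```r
library(tidyverse)

set.seed(1234)

df <- tibble(
  case = c("a", "a", "b", "b", "c", "c"),
  year = c(1990, 2000, 1990, 2000, 1990, 2000),
  var1 = round(runif(6, 0, 1), 2),
  var2 = round(runif(6, 10, 20), 2)
)

# Hypothetical helper (not from the original post):
# reshape the rows of one case into one row per variable, one column per year.
reshape_case <- function(d) {
  d %>%
    gather(var, value, -year) %>%
    spread(year, value)
}

result <- df %>%
  group_by(case) %>%
  nest() %>%                                  # one row per case, list column `data`
  mutate(data = map(data, reshape_case)) %>%  # apply the helper to each nested tibble
  unnest(data) %>%                            # bind the pieces back into one data frame
  ungroup()

result
```

This should give the same six rows as the direct gather/spread pipeline; the nested version is mainly worth the extra steps when the per-case work is something that cannot be expressed as a single whole-table operation.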

Answer 1 (score: 0)

Another option is to use dcast from the reshape2 package. But first we need to combine var1 and var2 into a single column with gather.

    library(tidyverse)
    library(reshape2)
    set.seed(1234)
    df <- tibble(
      case = c("a","a","b","b","c","c"),
      year = c(1990,2000,1990,2000,1990,2000),
      var1 = round(runif(6,0,1), 2),
      var2 = round(runif(6,10,20), 2)
    )
    # Use gather to combine var1 and var2, then apply dcast
    gather(df, var, val, var1:var2) %>% dcast(case + var ~ year, value.var = "val")
    # Result
    #    case  var  1990  2000
    #  1    a var1  0.11  0.62
    #  2    a var2 10.09 12.33
    #  3    b var1  0.61  0.62
    #  4    b var2 16.66 15.14
    #  5    c var1  0.86  0.64
    #  6    c var2 16.94 15.45
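
As a further alternative to both answers, the same gather-then-widen reshape can be sketched with tidyr's newer pivot_longer() and pivot_wider() functions. This is an assumption beyond the original thread: it requires tidyr 1.0.0 or later, and the object name wide is only illustrative.

```r
library(tidyverse)

set.seed(1234)

df <- tibble(
  case = c("a", "a", "b", "b", "c", "c"),
  year = c(1990, 2000, 1990, 2000, 1990, 2000),
  var1 = round(runif(6, 0, 1), 2),
  var2 = round(runif(6, 10, 20), 2)
)

wide <- df %>%
  # stack var1 and var2 into a name column and a value column
  pivot_longer(cols = c(var1, var2), names_to = "var", values_to = "value") %>%
  # spread the years back out into one column per year
  pivot_wider(names_from = year, values_from = value)

wide
```

The result has one row per case and variable, with one column for each year, matching the layout of the outputs shown above.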