Irregular nesting with tidyr

Asked: 2018-03-16 10:57:58

Tags: r nested tidyr

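I have the following problem. I want to nest a time series by date, but each date's nested data should also include the data from x earlier dates. A small example should make this clear.

Let's create an example tbl: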

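library(tidyverse)  # tibble(), nest(), mutate(), map2(), %>%

set.seed(13)  # make the random columns reproducible
tbl <- tibble(date = c(rep("2018-01-31", 3), rep("2018-02-28", 3), rep("2018-03-31", 3), rep("2018-04-30", 3)),
              form = rep(c("A", "B", "C"), 4),
              value = rnorm(n = 12),
              ind = runif(12))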

Nest it:

tbl %>% 
  nest(-date)

# A tibble: 4 x 2
  date       data            
  <chr>      <list>          
1 2018-01-31 <tibble [3 x 3]>
2 2018-02-28 <tibble [3 x 3]>
3 2018-03-31 <tibble [3 x 3]>
4 2018-04-30 <tibble [3 x 3]>

I like this data structure (I hate plain lists). I would like to get the following:

# A tibble: 4 x 2
  date       data            
  <chr>      <list>          
1 2018-01-31 <NA>
2 2018-02-28 <tibble [6 x 3]>
3 2018-03-31 <tibble [6 x 3]>
4 2018-04-30 <tibble [6 x 3]>

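The data in the 2018-02-28 row should include the Jan and Feb data, the 2018-03-31 row the Feb and Mar data, and so on. Ideally the solution would be flexible, so that I can specify how many previous periods to include.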

2 answers:

Answer 0: (score: 1)

Here is my idea, which seems to work. Thanks to Axeman for giving me the idea.

Helper function:

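library(tibbletime)  # provides rollify()

# window = 3 binds the current period with the two preceding ones;
# window = 2 would give the 6-row result asked for in the question
bind_roll <- rollify(~dplyr::bind_rows(.), window = 3, unlist = FALSE)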

tbl %>% 
  nest(-date) %>% 
  mutate(data2 = bind_roll(data))

# A tibble: 4 x 3
  date       data             data2           
  <chr>      <list>           <list>          
1 2018-01-31 <tibble [3 x 3]> <lgl [1]>       
2 2018-02-28 <tibble [3 x 3]> <lgl [1]>       
3 2018-03-31 <tibble [3 x 3]> <tibble [9 x 3]>
4 2018-04-30 <tibble [3 x 3]> <tibble [9 x 3]>

Answer 1: (score: 0)

For the two-period case, this is relatively easy to do by binding tables:

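library(purrr)  # map2(), safely()

# bind each period's data (.x) below the previous period's data (.y);
# if binding fails for the first row, safely() returns NA instead
tbl %>% 
  nest(-date) %>% 
  mutate(data2 = map2(data, lag(data), ~safely(bind_rows, otherwise = NA)(.y, .x)$result))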
# A tibble: 4 x 3
  date       data             data2           
  <chr>      <list>           <list>          
1 2018-01-31 <tibble [3 x 3]> <lgl [1]>       
2 2018-02-28 <tibble [3 x 3]> <tibble [6 x 3]>
3 2018-03-31 <tibble [3 x 3]> <tibble [6 x 3]>
4 2018-04-30 <tibble [3 x 3]> <tibble [6 x 3]>
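
The lag-based approach above covers exactly one previous period. A possible way to make the number of periods configurable, as the question asks, is sketched below. It is not taken from either answer: `bind_previous` and its `n_prev` argument are hypothetical names, the `nest(data = -date)` call assumes tidyr 1.0.0 or later, and the rows are assumed to be in chronological order already.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

# Hypothetical helper (not from the answers above): for element i of a list
# of data frames, bind elements (i - n_prev) through i. Positions with fewer
# than n_prev earlier periods get NA, as in the question's desired output.
bind_previous <- function(data_list, n_prev = 1) {
  map(seq_along(data_list), function(i) {
    if (i <= n_prev) {
      return(NA)
    }
    # Extract the window element by element so bind_rows() gets a plain list.
    window <- map((i - n_prev):i, function(j) data_list[[j]])
    bind_rows(window)
  })
}

set.seed(13)
tbl <- tibble(
  date  = rep(c("2018-01-31", "2018-02-28", "2018-03-31", "2018-04-30"), each = 3),
  form  = rep(c("A", "B", "C"), 4),
  value = rnorm(n = 12),
  ind   = runif(12)
)

result <- tbl %>%
  nest(data = -date) %>%   # assumes tidyr >= 1.0.0 syntax
  mutate(
    prev1 = bind_previous(data, n_prev = 1),  # current + 1 earlier period
    prev2 = bind_previous(data, n_prev = 2)   # current + 2 earlier periods
  )
```

With `n_prev = 1` the first row holds `NA` and the remaining rows hold 6-row tibbles; with `n_prev = 2` the first two rows hold `NA` and the last two hold 9-row tibbles. Because the helper works purely by position, the nested tibble should be sorted by date before calling it.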