Unnesting multiple columns with unnest_auto and unnest_longer

Time: 2019-07-11 11:49:23

Tags: tidyr

I have a nested data frame that I want to unnest. Here is a made-up example of its structure.

df <- structure(list(`_id` = c("a", "b", "c", "d"), 
                     variable = list(structure(list(type = c("u", "a", "u", "a", "u", "a", "a"), 
                                                    m_ = c("m1",
                                                           "m2",
                                                           "m3",
                                                           "m4",
                                                           "m5",
                                                           "m6", "m7"), #omitted from original example by mistake 
                                                    t_ = c("2015-07-21 4:13 PM", 
                                                           "2016-04-21 7:25 PM", 
                                                           "2017-10-04 9:49 PM", 
                                                           "2018-12-04 12:29 PM", 
                                                           "2019-04-20 20:20 AM", 
                                                           "2016-05-20 12:00 AM", 
                                                           "2016-06-20 12:00 AM"), 
                                                    a_ = c(NA, 
                                                           "", 
                                                           NA, 
                                                           "", 
                                                           NA, 
                                                           "C", 
                                                           "C")), 
                                               class = "data.frame", 
                                               row.names = c(NA, 7L)), 
                                     structure(list(type = c("u", "a"), 
                                                    m_ = c("m1",
                                                           "m2"), 
                                                    t_ = c("2018-05-24 12:08 AM", 
                                                           "2019-04-24 3:05 PM"), 
                                                    a_ = c(NA, "")), 
                                               class = "data.frame", 
                                               row.names = 1:2), 
                                     structure(list(type = "u", 
                                                    m_ = "m1", 
                                                    t_ = "2018-02-17 3:14 PM"), 
                                               class = "data.frame", 
                                               row.names = 1L), 
                                     structure(list(type = "u", 
                                                    m_ = "m1",
                                                    t_ = "2016-05-27 5:14 PM",
                                                    b_ = "b1", 
                                                    i_ = "i1", 
                                                    e_ = structure(list(), 
                                                                   .Names = character(0), 
                                                                   class = "data.frame", 
                                                                   row.names = c(NA, -1L)), 
                                                    l_ = "l1"), 
                                               class = "data.frame", 
                                               row.names = 1L)),
                     myDate = structure(c(1521503311.992, 
                                          1521514011.161, 
                                          1551699584.65, 
                                          1553632693.94), 
                                        class = c("POSIXct", "POSIXt"))), 
                row.names = c(1L, 2L, 3L, 4L), 
                class = "data.frame")
View(df)

variable is a list of data frames of differing lengths (in this example the longest has 7 rows, but that could grow over time).

I tried the development version of tidyr so that I could use the new unnest_auto() function.

# devtools::install_github("tidyverse/tidyr")
library(tidyr)
df2 <- unnest_auto(df, variable)
View(df2)

If I call unnest_longer() on the result and name a single column such as type, that column is expanded.

df3 <- unnest_longer(df2, type)

I don't see any argument to unnest_longer() for handling multiple columns. Is there a better way?

1 Answer:

Answer 0: (score: 0)

This seems to work:

df %>% unnest_auto(variable) %>% unnest()

#Warning message:
#`cols` is now required.
#Please use `cols = c(type, m_, t_, a_, e_)`

df %>% unnest_auto(variable) %>% unnest(cols = c(type, m_, t_, a_, e_, l_))

# A tibble: 11 x 10
   `_id` type  m_    t_     a_    b_    i_    e_    l_    myDate             
   <chr> <chr> <chr> <chr>  <chr> <chr> <chr> <???> <chr> <dttm>             
 1 a     u     m1    2015-… NA    NA    NA    NA    NA    2018-03-20 02:48:31
 2 a     a     m2    2016-… ""    NA    NA    NA    NA    2018-03-20 02:48:31
 3 a     u     m3    2017-… NA    NA    NA    NA    NA    2018-03-20 02:48:31
 4 a     a     m4    2018-… ""    NA    NA    NA    NA    2018-03-20 02:48:31
 5 a     u     m5    2019-… NA    NA    NA    NA    NA    2018-03-20 02:48:31
 6 a     a     m6    2016-… C     NA    NA    NA    NA    2018-03-20 02:48:31
 7 a     a     m7    2016-… C     NA    NA    NA    NA    2018-03-20 02:48:31
 8 b     u     m1    2018-… NA    NA    NA    NA    NA    2018-03-20 05:46:51
 9 b     a     m2    2019-… ""    NA    NA    NA    NA    2018-03-20 05:46:51
10 c     u     m1    2018-… NA    NA    NA    NA    NA    2019-03-04 14:39:44
11 d     u     m1    2016-… NA    b1    i1    NA    l1    2019-03-26 23:38:13
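
For comparison, here is a minimal sketch of two related approaches on small, hypothetical toy data (nested and listed are made-up names, not the df from the question). It assumes tidyr 1.2.0 or later, where, as far as I know, unnest_longer() accepts a selection of several columns; on older versions only the unnest() call is likely to apply.

```r
library(tidyr)

# Hypothetical toy data (not the df from the question): a list-column of
# data frames whose columns differ from row to row.
nested <- tibble::tibble(
  id = c("a", "b"),
  variable = list(
    data.frame(type = c("u", "a"), m_ = c("m1", "m2"), stringsAsFactors = FALSE),
    data.frame(type = "u", m_ = "m1", b_ = "b1", stringsAsFactors = FALSE)
  )
)

# Unnest the list-column of data frames in one step; columns that are
# absent from an element are filled with NA.
flat <- unnest(nested, cols = variable)

# Toy list-columns, similar in shape to what unnest_auto() returned above.
listed <- tibble::tibble(
  id = c("a", "b"),
  type = list(c("u", "a"), "u"),
  m_ = list(c("m1", "m2"), "m1")
)

# Assumes tidyr >= 1.2.0: unnest_longer() takes a selection of columns,
# and the selected columns are lengthened together, row by row.
long <- unnest_longer(listed, c(type, m_))
```

Whether either call behaves the same on the real data depends on details such as the empty e_ data frame column, so treat this as a starting point rather than a drop-in replacement.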