Question

Is it possible to create a tibble or data.frame in which some columns are integers and other columns are tibbles or data.frames?

E.g.:

library(tibble)
set.seed(1)
df.1 <- tibble(name = sample(LETTERS, 20, replace = FALSE), score = sample(1:100, 20, replace = FALSE))
df.2 <- tibble(name = sample(LETTERS, 20, replace = FALSE), score = sample(1:100, 20, replace = FALSE))

Then:

df <- tibble(id=1,rank=2,data=df.1)

gives the error:

Error: Column `data` must be a 1d atomic vector or a list

I assume df.1 has to be a list for this to work?

Answer 1

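Is this what you are looking for? I think the key is that every column must have the same length, so we need to wrap df.1 and df.2 in list() to create a list column that stores them, one data frame per row.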

df <- tibble(id = 1:2,
             rank = 2,
             data = list(df.1, df.2))
df
# # A tibble: 2 x 3
#      id  rank              data
#   <int> <dbl>            <list>
# 1     1     2 <tibble [20 x 2]>
# 2     2     2 <tibble [20 x 2]>
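
Since the question also mentions data.frame, here is a minimal sketch of the same idea in base R. It assumes that wrapping the list in I() is acceptable, which stops data.frame() from splitting the list into separate columns. The name base_df is only illustrative.

```r
set.seed(1)
df.1 <- data.frame(name = sample(LETTERS, 20), score = sample(1:100, 20))
df.2 <- data.frame(name = sample(LETTERS, 20), score = sample(1:100, 20))

# I() keeps the list "as is", so it becomes one list column with two elements
base_df <- data.frame(id = 1:2, rank = 2, data = I(list(df.1, df.2)))

nrow(base_df)            # one row per stored data frame
head(base_df$data[[2]])  # the second nested data frame
```

Without I(), data.frame() would treat each list element as a set of ordinary columns instead of storing the data frames whole.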

head(df$data[[1]])
# # A tibble: 6 x 2
#    name score
#   <chr> <int>
# 1     G    94
# 2     J    22
# 3     N    64
# 4     U    13
# 5     E    26
# 6     S    37

head(df$data[[2]])
# # A tibble: 6 x 2
#    name score
#   <chr> <int>
# 1     V    92
# 2     Q    30
# 3     S    45
# 4     M    33
# 5     L    63
# 6     Y    25
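
Because each element of the list column is an ordinary tibble, per-row summaries can be computed by iterating over the column. A minimal sketch using base vapply(); the column names n_rows and mean_score are made up for illustration.

```r
library(tibble)

set.seed(1)
df.1 <- tibble(name = sample(LETTERS, 20), score = sample(1:100, 20))
df.2 <- tibble(name = sample(LETTERS, 20), score = sample(1:100, 20))
df <- tibble(id = 1:2, rank = 2, data = list(df.1, df.2))

# vapply() returns exactly one value per nested tibble
df$n_rows <- vapply(df$data, nrow, integer(1))
df$mean_score <- vapply(df$data, function(d) mean(d$score), numeric(1))
df
```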

Since every tibble stored in the data column has the same structure, we can use tidyr::unnest to expand the tibble.

library(tidyr)
df_un <- unnest(df)
# # A tibble: 40 x 4
#       id  rank  name score
#    <int> <dbl> <chr> <int>
#  1     1     2     G    94
#  2     1     2     J    22
#  3     1     2     N    64
#  4     1     2     U    13
#  5     1     2     E    26
#  6     1     2     S    37
#  7     1     2     W     2
#  8     1     2     M    36
#  9     1     2     L    81
# 10     1     2     B    31
# # ... with 30 more rows
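
In tidyr 1.0.0 and later, unnest() expects the list column to be named through the cols argument. The following is a sketch under that version assumption (the answer does not say which tidyr version it used):

```r
library(tibble)
library(tidyr)

set.seed(1)
df.1 <- tibble(name = sample(LETTERS, 20), score = sample(1:100, 20))
df.2 <- tibble(name = sample(LETTERS, 20), score = sample(1:100, 20))
df <- tibble(id = 1:2, rank = 2, data = list(df.1, df.2))

# Name the list column explicitly; id and rank are repeated for each nested row
df_un <- unnest(df, cols = data)
dim(df_un)
```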

We can also nest the tibble again, which restores the original format with a list column.

library(dplyr)
df_n <- df_un %>%
  group_by(id, rank) %>%
  nest() %>%
  ungroup()
df_n
# # A tibble: 2 x 3
#        id  rank              data
#     <int> <dbl>            <list>
#   1     1     2 <tibble [20 x 2]>
#   2     2     2 <tibble [20 x 2]>

# Check if df and df_n are the same
identical(df_n, df)
# [1] TRUE
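
With tidyr 1.0.0 or later, the same round trip can be sketched without group_by() by naming the columns to pack into the list column. Whether identical() still returns TRUE may depend on the package versions installed, so this sketch compares the contents directly instead.

```r
library(tibble)
library(tidyr)

set.seed(1)
df.1 <- tibble(name = sample(LETTERS, 20), score = sample(1:100, 20))
df.2 <- tibble(name = sample(LETTERS, 20), score = sample(1:100, 20))
df <- tibble(id = 1:2, rank = 2, data = list(df.1, df.2))

df_un <- unnest(df, cols = data)

# Pack name and score into a list column called data; id and rank stay as keys
df_n <- nest(df_un, data = c(name, score))

# Compare the nested contents with the original inputs
identical(df_n$data[[1]]$score, df.1$score)
identical(df_n$data[[2]]$name, df.2$name)
```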

Answer 2

Using tidyr's nest:

set.seed(1)
df.1 <- data.frame(name = sample(LETTERS, 20, replace = FALSE), score = sample(1:100, 20, replace = FALSE))
df.2 <- data.frame(name = sample(LETTERS, 20, replace = FALSE), score = sample(1:100, 20, replace = FALSE))

I can create a tibble in which df.1 is nested under id and rank:

library(dplyr)
library(tidyr)

data.frame(id=1,rank=2,data=df.1) %>% nest(-id,-rank)

# A tibble: 1 × 3
     id  rank              data
  <dbl> <dbl>            <list>
1     1     2 <tibble [20 × 2]>

To have both df.1 and df.2 in the tibble, I'd simply do:

data.frame(id=c(1,2),rank=c(2,1),data=c(df.1,df.2)) %>% nest(-id,-rank)


# A tibble: 2 × 3
     id  rank              data
  <dbl> <dbl>            <list>
1     1     2 <tibble [10 × 4]>
2     2     1 <tibble [10 × 4]>
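
Note that the nested tibbles in the output above are 10 × 4 rather than 20 × 2: c(df.1, df.2) concatenates the four columns of both data frames, and data.frame() recycles id and rank across the 20 rows, so the rows alternate between the two groups. If the goal is one complete data frame per row, one possible sketch is to stack the two data frames with their own id and rank before nesting. It assumes tidyr 1.0.0 or later for the nest(data = ...) syntax, and the names long and nested are illustrative.

```r
library(dplyr)
library(tidyr)

set.seed(1)
df.1 <- data.frame(name = sample(LETTERS, 20), score = sample(1:100, 20),
                   stringsAsFactors = FALSE)
df.2 <- data.frame(name = sample(LETTERS, 20), score = sample(1:100, 20),
                   stringsAsFactors = FALSE)

# Tag each data frame with its own id and rank, then stack them row-wise
long <- bind_rows(
  mutate(df.1, id = 1, rank = 2),
  mutate(df.2, id = 2, rank = 1)
)

# Pack name and score into a list column; one row per (id, rank) pair
nested <- nest(long, data = c(name, score))
nested
```

Each element of nested$data then holds all 20 rows and both columns of the corresponding input.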
