Question

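I am working with several lists, each containing a large number of data frames. Each data frame has 3 variables (cluster, grp, value). For example, here is a sample from one list: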

$`0`
Source: local data frame [1 x 3]

  cluster   grp               value
    (int) (int)               (chr)
1       1     0 c Personal Care-277

$`1`
Source: local data frame [1 x 3]

  cluster   grp      value
    (int) (int)      (chr)
1       1     1 b Unpaid-1

$`2`
Source: local data frame [1 x 3]

  cluster   grp             value
    (int) (int)             (chr)
1       1     2 c Personal Care-1

What I want is to summarise this information in a vector so that it is easy to analyse. Desired output:

cluster 1 : (c Personal Care-277) - (b Unpaid-1) - (c Personal Care-1)

What I have tried:

library(plyr)
library(dplyr)

1) I first merge all the data frames together by cluster. I chose join_all, which seems to do the job fine apart from the odd column names it produces.

dt1 = dt %>% lapply(fgr) %>% 
  join_all(by = 'cluster') %>% 
  `colnames<-`(c("cluster", paste('t', 1:3, sep = '')))

2) I then use paste to put the values together in a stylised way:

dt1 %>% 
  mutate(print = paste('cluster: ', cluster, ' (' , t1, ')', '(', t2 , ')', '(',    t3 , ')', sep="") ) %>% 
  select(print)

#                                                             print
# 1 cluster: 1 (c Personal Care-277)(b Unpaid-1)(c Personal Care-1)

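The problem is that I have many different lists containing many data frames, and the lists are of unequal length. The list in this example yields 3 elements, t1, t2 and t3 (plus cluster), but some lists have 4 or more.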

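Question

First, I would like to know whether there is a way to automate this paste, to avoid writing out each term (t1 and so on) by hand. Second, I would like to know whether there is a better routine than the one I show here.

Thanks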

Data (list)

t2
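
The data dump itself is not reproduced here. As a minimal sketch, a stand-in for the list `dt` can be rebuilt from the printed output above. This is an assumed reconstruction based only on the three data frames shown, not the asker's actual data:

```r
# Assumed reconstruction of `dt` from the printed sample (not the original dump).
# Element names "0", "1", "2" and the column types follow the printed output.
dt <- list(
  `0` = data.frame(cluster = 1L, grp = 0L, value = "c Personal Care-277",
                   stringsAsFactors = FALSE),
  `1` = data.frame(cluster = 1L, grp = 1L, value = "b Unpaid-1",
                   stringsAsFactors = FALSE),
  `2` = data.frame(cluster = 1L, grp = 2L, value = "c Personal Care-1",
                   stringsAsFactors = FALSE)
)

# Inspect the structure: a list of three 1 x 3 data frames.
str(dt)
```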

Answer 1

You could try:

library(dplyr)
bind_rows(dt) %>% 
        group_by(cluster) %>% 
        summarise(new = paste0('cluster: ', unique(cluster), ' (', paste(value, collapse = ','), ')')) %>% 
        select(new)

# A tibble: 1 × 1
#                                                            new
#                                                          <chr>
#1 cluster: 1 (c Personal Care-277,b Unpaid-1,c Personal Care-1)
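
If the exact layout from the question is wanted (each value in its own parentheses, joined by " - "), the same idea can be adjusted slightly. This is a sketch only; the list `dt` below is an assumed stand-in rebuilt from the printed sample in the question:

```r
library(dplyr)

# Assumed stand-in for `dt`, rebuilt from the sample printed in the question.
dt <- list(
  `0` = data.frame(cluster = 1L, grp = 0L, value = "c Personal Care-277",
                   stringsAsFactors = FALSE),
  `1` = data.frame(cluster = 1L, grp = 1L, value = "b Unpaid-1",
                   stringsAsFactors = FALSE),
  `2` = data.frame(cluster = 1L, grp = 2L, value = "c Personal Care-1",
                   stringsAsFactors = FALSE)
)

out <- bind_rows(dt) %>%
  group_by(cluster) %>%
  # Wrap every value in parentheses, then collapse them with " - ".
  summarise(print = paste0("cluster ", unique(cluster), " : ",
                           paste0("(", value, ")", collapse = " - "))) %>%
  pull(print)

out
# [1] "cluster 1 : (c Personal Care-277) - (b Unpaid-1) - (c Personal Care-1)"
```

Because the values are stacked into rows before pasting, nothing has to be written per column, so a list with 4 or more data frames should go through the same code unchanged.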

Answer 2

We can also use rbindlist from data.table:

library(data.table)
rbindlist(dt)[, sprintf("cluster: %s (%s)", unique(cluster), 
        paste(unique(value), collapse=')(')),  by = cluster]$V1
#[1] "cluster: 1 (c Personal Care-277)(b Unpaid-1)(c Personal Care-1)"
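
Since the question mentions many lists of different lengths, the same stack-then-paste idea can be wrapped in a function and applied to each list. The following is a base R sketch under assumptions: `summarise_clusters` and `make_df` are hypothetical helper names, and the values in `list_b` are made up purely to show a list with 4 data frames:

```r
# Hypothetical helper: stack a list of data frames and build one string per cluster.
summarise_clusters <- function(dfs) {
  all_rows <- do.call(rbind, dfs)                     # stack all data frames
  vals <- split(all_rows$value, all_rows$cluster)     # values grouped by cluster
  vapply(names(vals), function(cl) {
    sprintf("cluster %s : %s", cl,
            paste0("(", vals[[cl]], ")", collapse = " - "))
  }, character(1), USE.NAMES = FALSE)
}

# Hypothetical constructor, used only to keep the example data short.
make_df <- function(cluster, grp, value) {
  data.frame(cluster = cluster, grp = grp, value = value,
             stringsAsFactors = FALSE)
}

# list_a mirrors the sample in the question; list_b is invented, with 4 elements.
list_a <- list(make_df(1L, 0L, "c Personal Care-277"),
               make_df(1L, 1L, "b Unpaid-1"),
               make_df(1L, 2L, "c Personal Care-1"))
list_b <- list(make_df(2L, 0L, "a"), make_df(2L, 1L, "b"),
               make_df(2L, 2L, "c"), make_df(2L, 3L, "d"))

result <- lapply(list(a = list_a, b = list_b), summarise_clusters)
result
# $a
# [1] "cluster 1 : (c Personal Care-277) - (b Unpaid-1) - (c Personal Care-1)"
#
# $b
# [1] "cluster 2 : (a) - (b) - (c) - (d)"
```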
