R - Tabulating character vectors - custom output

Time: 2016-06-21 19:04:55

Tags: r

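I would like to tabulate the values of some small character vectors and append the tabulation results to a string. For the reproducible example below, my desired output would look like this: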

  states                 responsible
1     KS             Joe(2);Suzie(3)
2     MO                      Bob(4)
3     CO    Suzie(1);Bob(2);Ralph(3)
4     NE                      Joe(1)
5     MT           Suzie(3);Ralph(1)

Here is the sample data:

states <- c("KS", "MO", "CO", "NE", "MT")
responsible <- list(c("Joe", "Joe", "Suzie", "Suzie", "Suzie"), c("Bob", "Bob", "Bob", "Bob"), c("Suzie", "Bob", "Ralph", "Ralph", "Bob", "Ralph"), "Joe", c("Suzie", "Ralph", "Suzie", "Suzie"))

df <- as.data.frame(cbind(states, responsible))

#Tabulating using table()
resp.tab <- lapply(responsible, table)

#Is there a way I can do tabulation without converting to factors?
# OR
#Is there a way to access the factor label and value, then paste them together? 

1 Answer:

Answer 0 (score: 2)

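We can use data.table. Create a data.table by replicating 'states' by the lengths of 'responsible' and unlisting 'responsible':

library(data.table)
dt1 <- data.table(states = rep(states, lengths(responsible)),
                  responsible = unlist(responsible))

Grouped by 'states' and 'responsible', we get the frequency (.N). Then, grouped by 'states', we paste the 'responsible' and 'N' columns together and collapse the rows that belong to the same 'states':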

dt1[,  .N, .(states, responsible)
  ][,  .(responsible = paste(paste0(responsible, 
                   "(", N, ")"), collapse=";")) ,.(states)]
#  states              responsible
#1:     KS          Joe(2);Suzie(3)
#2:     MO                   Bob(4)
#3:     CO Suzie(1);Bob(2);Ralph(3)
#4:     NE                   Joe(1)
#5:     MT        Suzie(3);Ralph(1)
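
For comparison, the same strings can also be built in base R, without data.table and without working with the factor directly. This is only an illustrative sketch, not part of the original answer. The helper name tab_string is hypothetical, and it assumes the names should appear in order of first appearance, as in the desired output above.

```r
states <- c("KS", "MO", "CO", "NE", "MT")
responsible <- list(c("Joe", "Joe", "Suzie", "Suzie", "Suzie"),
                    c("Bob", "Bob", "Bob", "Bob"),
                    c("Suzie", "Bob", "Ralph", "Ralph", "Bob", "Ralph"),
                    "Joe",
                    c("Suzie", "Ralph", "Suzie", "Suzie"))

# Hypothetical helper: tabulate one character vector and format it as "name(count);..."
tab_string <- function(x) {
  # Fix the level order to first appearance so table() does not sort alphabetically
  tab <- table(factor(x, levels = unique(x)))
  # names(tab) holds the labels, as.vector(tab) the counts
  paste0(names(tab), "(", as.vector(tab), ")", collapse = ";")
}

result <- data.frame(states = states,
                     responsible = vapply(responsible, tab_string, character(1)),
                     stringsAsFactors = FALSE)
print(result)
```

Here names() and as.vector() on the table object give the labels and counts, which is one way to address the question of accessing the label and value and pasting them together.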