Listing IDs by unique identifier

Time: 2018-07-26 19:40:05

Tags: r apply

I have the following data:

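# IDs are quoted so the leading zeros survive (numeric 001 would be stored as 1)
ID <- c("001", "002", "003", "003", "004", "005")
Email <- c("tom@abc.com", "jane@abc.com", "jim@abc.com", "jim@abc.com", "tom@abc.com", "mike@abc.com")
df <- data.frame(ID, Email, stringsAsFactors = FALSE)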

I want to create a table that lists, for each email address, the ID numbers associated with it:

Email           IDs
jane@abc.com    002
jim@abc.com     003
mike@abc.com    005
tom@abc.com     001, 004

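I tried an apply-family function, tapply(df$ID, df$Email, FUN = length), but that only returned the count of rows per email (duplicates included), not the IDs themselves: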

jane@abc.com    1
jim@abc.com     2
mike@abc.com    1
tom@abc.com     2

1 answer:

Answer 0 (score: 3)

With data.table, this is straightforward:

df <- data.frame(
    id = c("001","002","003","003","004","005"),
    email = c("tom@abc.com","jane@abc.com","jim@abc.com","jim@abc.com","tom@abc.com","mike@abc.com"),
    stringsAsFactors = FALSE
)

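library(data.table)
setDT(df)  # convert df to a data.table in place (by reference)
# For each email, paste its unique ids into one comma-separated string.
# Rows come back in order of first appearance; use keyby = email to sort them.
df[ , .(idlist = paste(unique(id), collapse = ", ")), by = email]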
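
If you would rather stay in base R, the tapply attempt from the question is close: it only needs a function that collapses the unique IDs instead of counting them. The following is a minimal sketch, not part of the original answer. The helper name collapse_unique and the result names ids_by_email, result and result_agg are made up for illustration.

```r
# Same sample data as above, rebuilt so this block runs on its own
df <- data.frame(
    id = c("001", "002", "003", "003", "004", "005"),
    email = c("tom@abc.com", "jane@abc.com", "jim@abc.com",
              "jim@abc.com", "tom@abc.com", "mike@abc.com"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: drop duplicate ids, then join them with ", "
collapse_unique <- function(x) paste(unique(x), collapse = ", ")

# tapply applies the helper to the ids of each email group;
# the result is a named character array, ordered by email
ids_by_email <- tapply(df$id, df$email, FUN = collapse_unique)

# Reshape into a two-column data frame
result <- data.frame(
    email = names(ids_by_email),
    ids = as.vector(ids_by_email),
    stringsAsFactors = FALSE
)
print(result)

# aggregate() gives the same grouping directly as a data frame
result_agg <- aggregate(id ~ email, data = df, FUN = collapse_unique)
print(result_agg)
```

Both versions return one row per email, sorted alphabetically, with jim@abc.com showing 003 once and tom@abc.com showing "001, 004".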