My data looks like this:
1. CM-00063262-15 EARRINGS
2. CM-00063262-15 EARRINGS
3. CM-00063262-15 NECKLACE
4. CM-00063262-15 WALLET-WOMEN'S
5. CM-00063263-15 SLACKS
6. CM-00063264-15 BATH TUB
7. CM-00063264-15 GIFT SET
I want output like this:
1. CM-00063262-15 EARRINGS,EARRINGS,NECKLACE,WALLET-WOMEN'S
2. CM-00063263-15 SLACKS
3. CM-00063264-15 BATH TUB,GIFT SET
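Thanks in advance.
Answer 0 (score: 0)
We need to extract the bill number (the text before the first space) and use it as the grouping variable: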
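library(data.table)
# Group by the bill number (text before the first whitespace) and
# join the unique item names (text after it) into one string per group.
# Drop unique() to keep repeated items such as the second EARRINGS.
setDT(df1)[, toString(unique(sub("^\\S+\\s+", "", Col))),
           by = .(grp = sub("\\s+.*", "", Col))]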
#              grp                                 V1
#1: CM-00063262-15 EARRINGS, NECKLACE, WALLET-WOMEN'S
#2: CM-00063263-15                             SLACKS
#3: CM-00063264-15                 BATH TUB, GIFT SET
If the OP's dataset has two columns instead of one, it is easier:
setDT(df1)[, toString(unique(Col2)), by = Col1]
Data used for the one-column example:
df1 <- structure(list(Col = c("CM-00063262-15 EARRINGS",
"CM-00063262-15 EARRINGS",
"CM-00063262-15 NECKLACE", "CM-00063262-15 WALLET-WOMEN'S", "CM-00063263-15 SLACKS",
"CM-00063264-15 BATH TUB", "CM-00063264-15 GIFT SET")),
.Names = "Col", class = "data.frame", row.names = c(NA, -7L))
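To make the grouping step easier to follow, here is a minimal base R sketch of what the two regular expressions do on their own. The names `x`, `grp`, and `item` are only illustrative, and the sketch assumes the bill number never contains whitespace.

```r
# Two sample strings in the same "bill-number item" shape as the question
x <- c("CM-00063262-15 EARRINGS", "CM-00063264-15 BATH TUB")

# Delete everything from the first run of whitespace onward: keeps the bill number
grp <- sub("\\s+.*", "", x)

# Delete the leading non-whitespace token and the whitespace after it: keeps the item
item <- sub("^\\S+\\s+", "", x)

print(grp)
print(item)
```

Because `sub()` replaces only the first match, an item name that itself contains spaces (such as BATH TUB) stays intact.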
Answer 1 (score: 0)
Use this (assuming the bill number is in column V1 and the item in column V2):
aggregate(V2 ~ V1, data = df, FUN = paste, collapse = ",")
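The `aggregate()` call assumes a data frame `df` that already has separate `V1` and `V2` columns, which the question's single-column data does not. The following is a rough base R sketch, not part of the original answer, that derives those columns from the one-column `df1` and passes `collapse = ","` through to `paste()` so that each group becomes a single string. The name `res` is illustrative, and the split again assumes the bill number contains no whitespace.

```r
# Single-column data in the same shape as the question
df1 <- data.frame(
  Col = c("CM-00063262-15 EARRINGS", "CM-00063262-15 EARRINGS",
          "CM-00063262-15 NECKLACE", "CM-00063262-15 WALLET-WOMEN'S",
          "CM-00063263-15 SLACKS", "CM-00063264-15 BATH TUB",
          "CM-00063264-15 GIFT SET"),
  stringsAsFactors = FALSE
)

# Split each string at the first run of whitespace into bill number and item
df <- data.frame(
  V1 = sub("\\s+.*", "", df1$Col),
  V2 = sub("^\\S+\\s+", "", df1$Col),
  stringsAsFactors = FALSE
)

# Extra arguments after FUN are forwarded to paste(), so collapse = ","
# joins all items of one bill number into a single string
res <- aggregate(V2 ~ V1, data = df, FUN = paste, collapse = ",")
print(res)
```

Unlike the data.table answer, this version does not call `unique()`, so the repeated EARRINGS entry is kept, as in the output the question asks for.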