Here is the data:
DPS Comodity Std Issue
111 Hard drive No Post
111 MBD NoBoot
111 LCD Flicker
222 MBD No Post
222 LCD No Post
333 MBD No power
I need to produce the following format:
DPS Comodity Std Issue
111 Hard drive,MBD,LCD Hard drive-No Post,MBD-NoBoot,LCD-Flicker
222 MBD,LCD No Post
333 MBD No Power
I tried aggregate(`Std Issue` ~ DPS, df, function(x) toString(unique(x))), but the resulting Std Issue column is:
No Post, NoBoot, Flicker
No Post
No Power
This is not what I need. Any suggestions for solving this kind of problem would be very helpful and much appreciated.
Answer 0 (score: 1)
You can use the data.table package:
> library(data.table)
> setDT(dt)[,Std_Issue:=paste0(Comodity,"-",Std.Issue)]
> setDT(dt)[, list(Comodity = paste(Comodity, collapse=","),
`Std Issue` = paste(Std_Issue, collapse=",")), by = DPS]
Output:
DPS Comodity Std Issue
1: 111 Hard drive,MBD,LCD Hard drive-No Post,MBD-NoBoot,LCD-Flicker
2: 222 MBD,LCD MBD-No Post,LCD-No Post
3: 333 MBD MBD-No power
Input data:
dt <- read.table(text="DPS Comodity Std Issue
111 Hard drive No Post
111 MBD NoBoot
111 LCD Flicker
222 MBD No Post
222 LCD No Post
333 MBD No power",header=T,sep="\t")
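Edited
You can use ifelse to return only the distinct issues when a group has fewer than three of them, and the commodity-issue pairs otherwise: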
> library(stringr)  # str_split() comes from stringr
> setDT(dt)[,Std_Issue:=paste0(Comodity,"-",Std.Issue)]
> setDT(dt)[, list(Std_issue = ifelse(length(unlist(unique(lapply(str_split(Std_Issue,"-"),function(x)x[2]))))<3,
paste(unique(`Std.Issue`), collapse=","),
paste(Std_Issue, collapse=",")),
Commodity=paste(Comodity, collapse=",")), by=DPS]
DPS Std_issue Commodity
1: 111 Hard drive-No Post,MBD-NoBoot,LCD-Flicker Hard drive,MBD,LCD
2: 222 No Post MBD,LCD
3: 333 No power MBD
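The threshold of three in the call above is specific to this sample. As a rough sketch of a more general rule (my assumption about the intent: when every row in a DPS group shares the same issue, show that issue once; otherwise show commodity-issue pairs), the grouping could be written as below. The column names Comodity and Std.Issue follow the input data above; the helper name collapse_issues is made up for illustration, and the table is rebuilt inline so the block runs on its own.

```r
library(data.table)

dt <- data.table(
  DPS       = c(111, 111, 111, 222, 222, 333),
  Comodity  = c("Hard drive", "MBD", "LCD", "MBD", "LCD", "MBD"),
  Std.Issue = c("No Post", "NoBoot", "Flicker", "No Post", "No Post", "No power")
)

# Hypothetical helper: a single issue if the whole group agrees,
# otherwise comma-separated "commodity-issue" pairs.
collapse_issues <- function(commodity, issue) {
  if (uniqueN(issue) == 1L) {
    issue[1L]
  } else {
    paste(paste0(commodity, "-", issue), collapse = ",")
  }
}

# One row per DPS: commodities joined with commas, issues collapsed by the helper.
res <- dt[, .(Comodity  = paste(Comodity, collapse = ","),
              Std_Issue = collapse_issues(Comodity, Std.Issue)),
          by = DPS]
print(res)
```

Testing for a single distinct issue instead of counting against a fixed number means the rule does not depend on how many rows a group happens to contain.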
Answer 1 (score: 0)
We can use dplyr and apply the same function to both columns, i.e.
library(dplyr)
df %>%
group_by(DPS) %>%
summarise_all(funs(toString(unique(.))))
which gives,
# A tibble: 3 x 3
    DPS Comodity             Std_Issue
  <int> <chr>                <chr>
1   111 Hard_drive, MBD, LCD No_Post, NoBoot, Flicker
2   222 MBD, LCD             No_Post
3   333 MBD                  No_power
Answer 2 (score: 0)
Finally, I found a working solution:
test_df <- data.frame(DPS=c(111,111,111,222,222,333),comodity =c("HDD","MBD","LCD","MBD","LCD","MBD"),stdIss=c("No Post","No Boot","Flicker","No Post","No Post","No Power"))
A <- data.frame(tapply(test_df$comodity,test_df$DPS,FUN = function(x){toString(x)}))
B <- data.frame(tapply(test_df$stdIss,test_df$DPS,FUN=function(x){toString(unique(x))}))
C <- data.frame(A,B)
colnames(C)[1] <- "comodity"
colnames(C)[2] <- "Std Issue"
# toString() joins with ", ", so split on ", " to avoid leading spaces
C$comodity <- strsplit(C$comodity, split = ", ")
C$`Std Issue` <- strsplit(C$`Std Issue`,split = ", ")
C$new <- NA
for(i in 1:nrow(C)){
D <- list() # reset for each DPS group so pairs from an earlier group are not carried over
if(length(C$`Std Issue`[[i]])>1){for(j in 1:length(C$`Std Issue`[[i]]))
{
D[j]<- paste(C$comodity[[i]][j],C$`Std Issue`[[i]][j],sep = "-")
}
C$new[i]<-paste(D,collapse = ",")
}
else
{
C$new[i] <-paste(C$`Std Issue`[i])
}
}
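
The loop above can also be expressed more compactly in base R. The following is only a sketch under the same assumption as the expected output (a group whose rows all share one issue shows that issue once; otherwise commodity-issue pairs are listed). summarise_group is a hypothetical helper name, and test_df is rebuilt here so the block runs on its own.

```r
test_df <- data.frame(
  DPS      = c(111, 111, 111, 222, 222, 333),
  comodity = c("HDD", "MBD", "LCD", "MBD", "LCD", "MBD"),
  stdIss   = c("No Post", "No Boot", "Flicker", "No Post", "No Post", "No Power"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse one DPS group into a single row.
summarise_group <- function(g) {
  issues <- if (length(unique(g$stdIss)) == 1L) {
    g$stdIss[1L]                                   # all rows agree: keep one copy
  } else {
    paste(paste(g$comodity, g$stdIss, sep = "-"),  # row-wise "commodity-issue" pairs
          collapse = ",")
  }
  data.frame(
    DPS      = g$DPS[1L],
    comodity = paste(g$comodity, collapse = ","),
    stdIss   = issues,
    stringsAsFactors = FALSE
  )
}

# split() yields one data frame per DPS value; rbind stacks the one-row results.
result <- do.call(rbind, lapply(split(test_df, test_df$DPS), summarise_group))
rownames(result) <- NULL
print(result)
```

Pairing commodity and issue row by row, before any de-duplication, keeps each issue attached to the commodity it was reported for.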