I have a dataset like this:
From To Weight
A B 3
A C 4
A D 1
A E 5
A J 8
B C 0.5
B E 2.5
B L 3
B M 5
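For each value in From, I need to generate combinations of 4 or 3 (user-defined) values from To, where the weight of a combination is the average of its members' weights. Is this possible in R?
The final dataset should look like this: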
From To1 To2 To3 To4 Weight(avg)
A B C D E 3.25
A B C D J 4
A B C E J 5
A .................................
A .................................
B C E L M 2.75
Answer 0 (score: -1)
You can try:
# the data
d <- structure(list(From = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L), .Label = c("A", "B"), class = "factor"), To = structure(c(1L,
2L, 3L, 4L, 5L, 2L, 4L, 6L, 7L), .Label = c("B", "C", "D", "E",
"J", "L", "M"), class = "factor"), Weight = c(3, 4, 1, 5, 8,
0.5, 2.5, 3, 5)), .Names = c("From", "To", "Weight"), class = "data.frame", row.names = c(NA,
-9L))
# get the combinations of To values within each From group (all but one value per group)
gr <- lapply(split(d, d$From), function(x) combn(as.character(x$To), length(x$To)-1 ))
gr
$A
     [,1] [,2] [,3] [,4] [,5]
[1,] "B"  "B"  "B"  "B"  "C"
[2,] "C"  "C"  "C"  "D"  "D"
[3,] "D"  "D"  "E"  "E"  "E"
[4,] "E"  "J"  "J"  "J"  "J"

$B
     [,1] [,2] [,3] [,4]
[1,] "C"  "C"  "C"  "E"
[2,] "E"  "E"  "L"  "L"
[3,] "L"  "M"  "M"  "M"
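The call above always takes all but one To value per group (`length(x$To) - 1`), which happens to give 4 for A and 3 for B. If the size should be user-defined, as the question asks, a minimal sketch could look like the following. The names `k` and `combos_for_k` are hypothetical and not part of the original answer, and skipping groups with fewer than `k` values is an assumption.

```r
# Copy of the question's data, with strings kept as character
d <- data.frame(
  From   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  To     = c("B", "C", "D", "E", "J", "C", "E", "L", "M"),
  Weight = c(3, 4, 1, 5, 8, 0.5, 2.5, 3, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all size-k combinations of To within each From group
combos_for_k <- function(data, k) {
  groups <- split(data, data$From)
  # Assumption: a group with fewer than k destinations yields no combinations
  groups <- Filter(function(g) nrow(g) >= k, groups)
  lapply(groups, function(g) combn(g$To, k))
}

gr_k <- combos_for_k(d, k = 3)
ncol(gr_k$A)  # 10, i.e. choose(5, 3)
ncol(gr_k$B)  # 4, i.e. choose(4, 3)
```

With `k = 5` only group A survives the filter, because B has just four To values.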
# calculate the Means using the two lists and the Map function
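res <- Map(f = function(x, y){
  # x: matrix of combinations (one per column); y: the rows of d for this From value
  apply(x, 2, function(z, zi){
    mean(zi[zi[, 2] %in% z, 3])  # average Weight over the To values in combination z
  }, y)
}, gr, split(d, d$From))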
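To make the averaging step easier to follow, here is a small sketch of the same idea for a single combination, using a named lookup vector instead of `%in%`. The helper `mean_weight` is a hypothetical name, and the sketch assumes each To value appears at most once per From group.

```r
# Copy of the question's data, with strings kept as character
d <- data.frame(
  From   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  To     = c("B", "C", "D", "E", "J", "C", "E", "L", "M"),
  Weight = c(3, 4, 1, 5, 8, 0.5, 2.5, 3, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean Weight of the To values `tos` within one group `g`
mean_weight <- function(g, tos) {
  w <- setNames(g$Weight, g$To)  # named lookup: To -> Weight
  mean(w[tos])
}

a <- d[d$From == "A", ]
mean_weight(a, c("B", "C", "D", "J"))  # (3 + 4 + 1 + 8) / 4 = 4
```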
# And the final result data.frame
cbind.data.frame(From = rep(names(gr), unlist(lapply(gr, ncol))),
                 to = do.call(rbind, lapply(gr, function(x) cbind(apply(x, 2, paste, collapse = ",")))),
                 Ave = unlist(res))
   From      to      Ave
A1    A B,C,D,E 3.250000
A2    A B,C,D,J 4.000000
A3    A B,C,E,J 5.000000
A4    A B,D,E,J 4.250000
A5    A C,D,E,J 4.500000
B1    B   C,E,L 2.000000
B2    B   C,E,M 2.666667
B3    B   C,L,M 2.833333
B4    B   E,L,M 3.500000
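The question asks for one column per To value (To1, To2, ...) rather than a comma-separated string. A possible sketch of that layout for a user-defined size is shown below. The names `wide_combos`, `k` and `Weight_avg` are hypothetical, and the sketch assumes To values are unique within each From group and that groups with fewer than `k` values are dropped.

```r
# Copy of the question's data, with strings kept as character
d <- data.frame(
  From   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  To     = c("B", "C", "D", "E", "J", "C", "E", "L", "M"),
  Weight = c(3, 4, 1, 5, 8, 0.5, 2.5, 3, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per combination, columns To1..Tok plus the mean weight
wide_combos <- function(data, k) {
  rows <- lapply(split(data, data$From), function(g) {
    if (nrow(g) < k) return(NULL)      # assumption: skip groups that are too small
    w   <- setNames(g$Weight, g$To)    # lookup: To -> Weight within this group
    cmb <- combn(g$To, k)              # k x n matrix, one combination per column
    out <- data.frame(From = g$From[1], t(cmb), stringsAsFactors = FALSE)
    names(out)[-1] <- paste0("To", seq_len(k))
    out$Weight_avg <- apply(cmb, 2, function(tos) mean(w[tos]))
    out
  })
  rows <- Filter(Negate(is.null), rows)  # drop the skipped groups
  res <- do.call(rbind, rows)
  rownames(res) <- NULL
  res
}

res4 <- wide_combos(d, k = 4)
res4
```

For `k = 4` this gives five rows for A and a single row for B (C, E, L, M with an average of 2.75).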