Combinations of column values in R

Time: 2017-03-16 11:46:09

Tags: r dataframe dplyr combinations

I have a dataset like this:

From     To     Weight
 A       B        3
 A       C        4
 A       D        1
 A       E        5
 A       J        8
 B       C        0.5
 B       E        2.5
 B       L        3
 B       M        5

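For each value in From, I need to generate combinations of 4 or 3 (a user-defined number) of its To values, where the weight of a combination is the average of the weights of its members. Is this possible in R?

The final dataset should look like this: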

From    To1    To2   To3    To4   Weight(avg)
 A       B      C     D      E     3.25
 A       B      C     D      J     4
 A       B      C     E      J     5
 A     .................................
 A     .................................
 B       C      E     L      M    2.75

1 answer:

Answer 0: (score: -1)

You can try the following:

# the data
d <- data.frame(
  From   = factor(c("A", "A", "A", "A", "A", "B", "B", "B", "B")),
  To     = factor(c("B", "C", "D", "E", "J", "C", "E", "L", "M")),
  Weight = c(3, 4, 1, 5, 8, 0.5, 2.5, 3, 5)
)
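# get the combinations of To values within each From group;
# the size used here is one less than the number of To values in the group (4 for A, 3 for B)
gr <- lapply(split(d, d$From), function(x) combn(as.character(x$To), length(x$To) - 1))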
gr
$A
     [,1] [,2] [,3] [,4] [,5]
[1,] "B"  "B"  "B"  "B"  "C" 
[2,] "C"  "C"  "C"  "D"  "D" 
[3,] "D"  "D"  "E"  "E"  "E" 
[4,] "E"  "J"  "J"  "J"  "J" 

$B
     [,1] [,2] [,3] [,4]
[1,] "C"  "C"  "C"  "E" 
[2,] "E"  "E"  "L"  "L" 
[3,] "L"  "M"  "M"  "M" 
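# calculate the means: Map pairs each combination matrix in gr with its From group,
# then each column (one combination) is averaged over the Weight of the matching To rows
res <- Map(f = function(x, y) {
  apply(x, 2, function(z, zi) {
    mean(zi[zi[, 2] %in% z, 3])
  }, y)
}, gr, split(d, d$From))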

# And the final result data.frame
cbind.data.frame(From=rep(names(gr), unlist(lapply(gr, ncol))), 
      to=do.call(rbind, lapply(gr, function(x) cbind(apply(x,2,paste, collapse=",")))),
      Ave=unlist(res))
   From      to      Ave
A1    A B,C,D,E 3.250000
A2    A B,C,D,J 4.000000
A3    A B,C,E,J 5.000000
A4    A B,D,E,J 4.250000
A5    A C,D,E,J 4.500000
B1    B   C,E,L 2.000000
B2    B   C,E,M 2.666667
B3    B   C,L,M 2.833333
B4    B   E,L,M 3.500000
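
The code above takes the combination size from the size of each group, while the question asks for a user-defined size and separate To1, To2, ... columns. A minimal base R sketch of that variant follows. The helper name `combo_means` and its argument `k` are made up for this illustration and are not part of any package; the sketch also assumes that a From group with fewer than `k` To values should simply be skipped.

```r
# Example data from the question
d <- data.frame(
  From   = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  To     = c("B", "C", "D", "E", "J", "C", "E", "L", "M"),
  Weight = c(3, 4, 1, 5, 8, 0.5, 2.5, 3, 5),
  stringsAsFactors = FALSE
)

# combo_means() is a hypothetical helper; k is the user-defined combination size
combo_means <- function(d, k) {
  pieces <- lapply(split(d, d$From), function(x) {
    # assumption: groups with fewer than k To values contribute no rows
    if (nrow(x) < k) return(NULL)
    # row indices of every k-combination, one combination per column
    idx <- combn(nrow(x), k)
    # To values of each combination, one combination per row
    to <- t(matrix(x$To[idx], nrow = k))
    colnames(to) <- paste0("To", seq_len(k))
    data.frame(
      From   = x$From[1],
      to,
      # mean Weight of each combination
      Weight = colMeans(matrix(x$Weight[idx], nrow = k)),
      stringsAsFactors = FALSE
    )
  })
  # drop skipped groups and stack the rest
  res <- do.call(rbind, Filter(Negate(is.null), pieces))
  rownames(res) <- NULL
  res
}

res4 <- combo_means(d, 4)
res4
```

With `k = 4` this gives six rows: five for A with means 3.25, 4, 5, 4.25 and 4.5, and one for B (C, E, L, M) with mean 2.75. Calling `combo_means(d, 3)` instead gives ten rows for A and four for B. Working on row indices rather than on the To values keeps the weights aligned even if a To value were repeated within a group.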