How to apply a function over array margins and create a matrix of pairwise combinations

Time: 2016-11-08 16:24:24

Tags: r

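I am using R to apply a self-written function, which takes two numeric vectors and one numeric parameter, across the columns of a data frame. Each column of the data frame is a numeric vector. I want to run the calculation pairwise and build a matrix that covers every possible combination of columns and holds the result of the calculation for each pair. Basically, I want to reproduce the behaviour of the cor() function.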

# Data
> head(d)
            1         2         3            4
1 -1.01035342 1.2490665 0.7202516  0.101467379
2 -0.50700743 1.4356733 0.9032172 -0.001583743
3 -0.09055243 0.4695046 2.4487632 -1.082570048
4  1.11230416 0.2885735 0.3534247 -0.728574628
5 -1.96115691 0.4831158 1.5650052  0.648675605
6  1.20434218 1.7668086 0.2170858 -0.161570792
> cor(d)
            1           2           3           4
1  1.00000000  0.08320968 -0.06432155  0.04909430
2  0.08320968  1.00000000 -0.04557743 -0.01092765
3 -0.06432155 -0.04557743  1.00000000 -0.01654762
4  0.04909430 -0.01092765 -0.01654762  1.00000000

I found this useful answer: Perform pairwise comparison of matrix

Based on it, I wrote this function, which uses another self-written function, compareFunctions():

createProbOfNonEqMatrix <- function(df,threshold){
  combinations <- combn(ncol(df),2)
  predDF <- matrix(nrow = length(density(df[,1])$y)) # df creation for predicted values from density function
  for(i in 1:ncol(df)){
    predCol <- density(df[,i])$y # convert df of original values to df of predicted values from density function
    predDF <- cbind(predDF,predCol)
  }
  predDF <- predDF[,2:ncol(predDF)]
  colnames(predDF) <- colnames(df) # give the predicted values column names as in the original df
  predDF <- as.matrix(predDF)
  out.mx <- apply( X=combinations,MARGIN = 2,FUN = "compareFunctions",
    predicted_by_first = predDF[,combinations[1]],
    predicted_by_second = predDF[,combinations[2]],
    threshold = threshold)
return(out.mx)
}

predicted_by_first, predicted_by_second and threshold are the inputs to compareFunctions. However, I get the following error:

 Error in FUN(newX[, i], ...) : unused argument (newX[, i]) 

In desperation I tried:

createProbOfNonEqMatrix <- function(df,threshold){
  combinations <- combn(ncol(df),2)
  predDF <- matrix(nrow = length(density(df[,1])$y)) 
  for(i in 1:ncol(df)){
    predCol <- density(df[,i])$y 
    predDF <- cbind(predDF,predCol)
  }
  predDF <- predDF[,2:ncol(predDF)]
  colnames(predDF) <- colnames(df) 
  predDF <- as.matrix(predDF)
  out.mx <- apply(
    X=combinations,MARGIN = 2,FUN = function(x) {
      diff <- abs(predDF[,x[1]]-predDF[,x[2]])
      boolean <- diff<threshold
      acceptCount <- length(boolean[boolean==TRUE])
      probability <- acceptCount/length(diff)
      return(probability)
    }
    )
return(out.mx)
}

It does seem to work, but instead of a pairwise matrix it gives me a vector:

> createProbOfNonEqMatrix(d,0.001)
[1] 0.10351562 0.08203125 0.13476562 0.13085938 0.14843750 0.10937500

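Even if it means writing the function body inside apply() again, could you show me how to produce the desired pairwise matrix? I would also appreciate it if you could tell me how to track the progress of the pairwise comparisons. Thanks, Ale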

1 Answer:

Answer 0 (score: 0)

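Your output gives the results in the order of the pairs in combinations: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4). If you want to arrange them into a symmetric square matrix, you can do some basic manipulation of the result, for example: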

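# Values returned by createProbOfNonEqMatrix(d, 0.001), in combn() order
out.mx <- c(0.10351562, 0.08203125, 0.13476562, 0.13085938, 0.14843750, 0.10937500)
combinations <- combn(ncol(d), 2)                     # same pairs as inside the function
out.mtx <- matrix(1, nrow = ncol(d), ncol = ncol(d))  # diagonal stays at 1
for (i in seq_len(ncol(combinations))) {
  a <- combinations[1, i]
  b <- combinations[2, i]
  out.mtx[a, b] <- out.mtx[b, a] <- out.mx[i]         # fill both triangles
}
out.mtx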

which gives you

          [,1]      [,2]       [,3]      [,4]
[1,] 1.00000000 0.1035156 0.08203125 0.1347656
[2,] 0.10351562 1.0000000 0.13085938 0.1484375
[3,] 0.08203125 0.1308594 1.00000000 0.1093750
[4,] 0.13476562 0.1484375 0.10937500 1.0000000
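
The same idea can be folded into a single function that builds the symmetric matrix directly and reports each pair as it is processed, which also covers the question about tracking the comparisons. This is only a minimal sketch. compareFunctions() was not shown, so compare_pair() below is a hypothetical stand-in based on the inline code from the second attempt. prob_matrix() and its verbose argument are likewise names made up for illustration, and the data are simulated. On the original error: apply() passes each column of combinations to FUN as an extra unnamed argument, so a function that has already received all of its arguments by name has nowhere to put it. Indexing with the pair inside a loop or an anonymous function avoids this.

```r
# Hypothetical stand-in for compareFunctions(): the share of grid points at
# which two density curves differ by less than `threshold`.
compare_pair <- function(first, second, threshold) {
  mean(abs(first - second) < threshold)
}

# Hypothetical helper: returns a symmetric matrix of pairwise results.
prob_matrix <- function(df, threshold, verbose = TRUE) {
  df <- as.data.frame(df)
  # One column of estimated density values per input column
  pred <- sapply(df, function(col) density(col)$y)
  n <- ncol(pred)
  # Diagonal is set to 1, as in the answer above
  out <- matrix(1, nrow = n, ncol = n,
                dimnames = list(colnames(pred), colnames(pred)))
  pairs <- combn(n, 2)
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    if (verbose) {
      # Progress report for the current pair
      message(sprintf("pair %d of %d: %s vs %s",
                      k, ncol(pairs), colnames(pred)[i], colnames(pred)[j]))
    }
    # Fill both triangles with the same value
    out[i, j] <- out[j, i] <- compare_pair(pred[, i], pred[, j], threshold)
  }
  out
}

# Simulated data standing in for the asker's `d`
set.seed(1)
d <- as.data.frame(matrix(rnorm(400), ncol = 4))
m <- prob_matrix(d, threshold = 0.001)
m
```

Setting verbose = FALSE suppresses the progress messages. A plain for loop is used rather than apply() so that the pair index k is available for reporting.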