Computing the Euclidean distance between all rows of matrices A and B

Time: 2015-11-05 19:59:50

Tags: r matrix euclidean-distance

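I have two matrices, A and B, with N_a and N_b rows respectively. I need to compute the Euclidean distance between every pairwise combination of a row of A (a) and a row of B (b), so that the output is an N_a by N_b matrix in which cell [a, b] is the distance from a to b. I have put an example below.

# Example
set.seed(1)
A <- matrix(rnorm(1000, 5, 50), ncol = 5)
B <- matrix(rnorm(10000, 0, 50), ncol = 5)
# Return N_a x N_b matrix of euclidean distances, where [a,b] is the
# distance from a to b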

4 answers:

Answer 0 (score: 3)

A one-liner with no loops and no extra packages, and it is somewhat faster:

euklDist <- sqrt(apply(array(apply(B,1,function(x){(x-t(A))^2}),c(ncol(A),nrow(A),nrow(B))),2:3,sum))

Speed comparison:

> microbenchmark(jogo  = for (i in 1:nrow(A)) for (j in 1:nrow(B)) d[i,j] <- sqrt(sum((A[i,]-B[j,])^2)),
+                mra68 = sqrt(apply(array(app .... [TRUNCATED] 
Unit: seconds
  expr      min       lq     mean   median       uq      max neval
  jogo 3.601533 4.724619 5.403420 5.549199 6.098734 6.470888    10
 mra68 1.334661 1.635258 2.473297 2.542550 3.247981 3.348365    10
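
The one-liner is dense, so here is a minimal sketch that splits it into named steps and shows the shape of each intermediate result. The small matrix sizes and the names `sq` and `sq3` are assumptions chosen for readability; they are not part of the original answer.

```r
# Small illustrative matrices (sizes are an assumption, chosen for readability)
set.seed(1)
A <- matrix(rnorm(20), ncol = 5)   # 4 rows
B <- matrix(rnorm(30), ncol = 5)   # 6 rows

# Step 1: for each row x of B, squared differences against every row of A.
# t(A) is ncol(A) x nrow(A), so x is recycled down each column.
sq <- apply(B, 1, function(x) (x - t(A))^2)   # (ncol(A) * nrow(A)) x nrow(B)

# Step 2: restore the three dimensions: coordinate, row of A, row of B.
sq3 <- array(sq, c(ncol(A), nrow(A), nrow(B)))

# Step 3: sum over the coordinate dimension and take the square root.
euklDist <- sqrt(apply(sq3, 2:3, sum))        # nrow(A) x nrow(B)

# Spot check one cell against the definition of the distance.
stopifnot(isTRUE(all.equal(euklDist[2, 3], sqrt(sum((A[2, ] - B[3, ])^2)))))
```

The intermediate array holds ncol(A) * nrow(A) * nrow(B) values, so memory use grows with the product of both row counts.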

Answer 1 (score: 2)

# Example
set.seed(1)
A <- matrix(rnorm(1000, 5, 50), ncol = 5)
B <- matrix(rnorm(10000, 0, 50), ncol = 5)
d <- matrix(NA, nrow(A), nrow(B))
for (a in 1:nrow(A)) for (b in 1:nrow(B)) d[a,b] <- sqrt(sum((A[a,]-B[b,])^2))
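
For comparison, here is a vectorized sketch that does not come from any of the answers. It swaps the double loop for the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b, so most of the work is a single matrix product. The function name `pairwise_dist` is hypothetical, and only base R is assumed.

```r
# Hypothetical helper: all pairwise Euclidean distances between rows of A and rows of B
pairwise_dist <- function(A, B) {
  # Squared norm of each row
  a2 <- rowSums(A^2)
  b2 <- rowSums(B^2)
  # outer(a2, b2, "+") is nrow(A) x nrow(B); tcrossprod(A, B) equals A %*% t(B)
  sq <- outer(a2, b2, "+") - 2 * tcrossprod(A, B)
  # Rounding can leave tiny negative values; clamp them before the square root
  sqrt(pmax(sq, 0))
}

set.seed(1)
A <- matrix(rnorm(1000, 5, 50), ncol = 5)
B <- matrix(rnorm(10000, 0, 50), ncol = 5)
d_fast <- pairwise_dist(A, B)
dim(d_fast)
```

The expanded form can lose precision when two rows are nearly identical relative to their magnitude, so the explicit loop remains the safer reference in that case.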

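Answer 2 (score: 0)

Here is a parallelized solution that uses one of my packages. Note that the current build on GitHub is unstable, so you have to install from an earlier commit from yesterday.

Edit: v0.7.1 is now stable, so you no longer need the commit ref.

This solution will only be faster if both matrices are large and/or you have many cores. But I had fun writing it, so: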

devtools::install_github("alexwhitworth/imputation", 
  ref= "75723b769ed2ceae8c915d00089a31f059e447aa")
library(microbenchmark)
library(parallel)

f <- function(a, b) {
  nnodes <- detectCores()
  cl <- makeCluster(nnodes)
  # Split the rows of a across the workers and bind the per-worker results by column
  d <- do.call("cbind", clusterApply(cl, x= parallel:::splitRows(a, nnodes),
           fun= function(x_sub, b) {
              apply(x_sub, 1, function(i, b) {imputation::dist_q.matrix(x= rbind(i, b), ref= 1L, q=2)}, b= b)
            }, b= b))
  stopCluster(cl)
  return(d)
}

a <- matrix(rnorm(50000), 1000)
b <- matrix(rnorm(50000), 1000)
d <- matrix(NA, 1000, 1000)
# run on 4 cores
microbenchmark(jogo= for (i in 1:nrow(a)) for (j in 1:nrow(b)) d[i,j] <- sqrt(sum((a[i,]-b[j,])^2)),
               alex= f(a,b), times= 10L)

Unit: seconds
 expr      min       lq     mean   median       uq      max neval cld
 jogo 4.190531 4.196546 4.289265 4.265351 4.358022 4.486445    10   b
 alex 3.585672 3.603485 3.783583 3.760859 3.966435 4.048676    10  a 

If you really want to, you could improve on this with library(Rdsm)... but I recommend using jogo's answer.

Answer 3 (score: 0)

# Distances between every row of x and every row of y (nrow(x) x nrow(y))
interdist_func <- function(x, y){
  apply(y, 1, FUN=function(y_i){
    sqrt(colSums((t(x)-y_i)^2))
  })
}

library(microbenchmark)

set.seed(1)
A <- matrix(rnorm(1000, 5, 50), ncol = 5)
B <- matrix(rnorm(10000, 0, 50), ncol = 5)

d <- matrix(NA, nrow(A), nrow(B))

microbenchmark(
jogo  = 
for (i in 1:nrow(A)) for (j in 1:nrow(B)) d[i,j] <-sqrt(sum((A[i,]-B[j,])^2)),

mra68 = 
sqrt(apply(array(apply(B,1,function(x){(x-t(A))^2}),c(ncol(A),nrow(A),nrow(B))),2:3,sum)),

roboshea = 
apply(B, 1, FUN=function(B_i){sqrt(colSums((t(A)-B_i)^2))}))

#Unit: milliseconds
#     expr      min        lq      mean    median        uq       max neval cld
#     jogo 486.0123 553.45700 585.69967 580.20000 619.26870  751.2992   100  b 
#    mra68 512.1435 606.38120 653.00116 639.32560 675.40945 1011.6164   100  c
# roboshea  29.5313  32.95525  42.32124  37.87175  41.27385  128.2292   100  a
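
The benchmark compares speed only. As a sanity check, the sketch below confirms that the three approaches return the same matrix on the example data. The variable names `d_loop`, `d_mra68` and `d_roboshea` are assumptions introduced here for illustration.

```r
set.seed(1)
A <- matrix(rnorm(1000, 5, 50), ncol = 5)
B <- matrix(rnorm(10000, 0, 50), ncol = 5)

# jogo: explicit double loop over rows of A and rows of B
d_loop <- matrix(NA_real_, nrow(A), nrow(B))
for (i in seq_len(nrow(A))) {
  for (j in seq_len(nrow(B))) {
    d_loop[i, j] <- sqrt(sum((A[i, ] - B[j, ])^2))
  }
}

# mra68: apply/array one-liner
d_mra68 <- sqrt(apply(array(apply(B, 1, function(x) (x - t(A))^2),
                            c(ncol(A), nrow(A), nrow(B))), 2:3, sum))

# roboshea: one colSums() call per row of B
d_roboshea <- apply(B, 1, function(B_i) sqrt(colSums((t(A) - B_i)^2)))

# All three should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(d_loop, d_mra68)),
          isTRUE(all.equal(d_loop, d_roboshea)))
dim(d_roboshea)
```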