Calculating the total distance of path orderings

Asked: 2018-03-07 00:04:02

Tags: r euclidean-distance

I have a distance matrix and a data.frame of orderings, and I want to compute the total distance for each ordering (row).

Distance matrix (generated by as.matrix(dist(x, upper=TRUE, diag=TRUE))):

               FOV5.1.T4.C1 FOV5.1.T4.C1.1 FOV5.1.T4.C2 FOV5.1.T4.C2.1
FOV5.1.T4.C1      0.0000000     11.5454430    0.3431676     13.2814257
FOV5.1.T4.C1.1   11.5454430      0.0000000   11.5625031      2.8374444
FOV5.1.T4.C2      0.3431676     11.5625031    0.0000000     13.2407547
FOV5.1.T4.C2.1   13.2814257      2.8374444   13.2407547      0.0000000

Orderings (generated by expand.grid()):

            Var1           Var2
1   FOV5.1.T4.C2   FOV5.1.T4.C1
2 FOV5.1.T4.C2.1   FOV5.1.T4.C1
3   FOV5.1.T4.C2 FOV5.1.T4.C1.1
4 FOV5.1.T4.C2.1 FOV5.1.T4.C1.1

Expected output:

            Var1           Var2          Dist
1   FOV5.1.T4.C2   FOV5.1.T4.C1      0.3431676
2 FOV5.1.T4.C2.1   FOV5.1.T4.C1     13.2814257
3   FOV5.1.T4.C2 FOV5.1.T4.C1.1       ...
4 FOV5.1.T4.C2.1 FOV5.1.T4.C1.1       ...

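I would like to append a total-distance column to the end of the orderings data frame, giving the total distance travelled from Var1 to VarN.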
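For the two-column case above, one possible sketch looks each pair up directly in the distance matrix by name. The object names `dm` and `ord` are placeholders chosen for this illustration, and the matrix is rebuilt by hand from the values shown above so that the example is self-contained.

```r
# Distance matrix from the question, rebuilt by hand (names dm/ord are illustrative).
ids <- c("FOV5.1.T4.C1", "FOV5.1.T4.C1.1", "FOV5.1.T4.C2", "FOV5.1.T4.C2.1")
dm <- matrix(
  c( 0.0000000, 11.5454430,  0.3431676, 13.2814257,
    11.5454430,  0.0000000, 11.5625031,  2.8374444,
     0.3431676, 11.5625031,  0.0000000, 13.2407547,
    13.2814257,  2.8374444, 13.2407547,  0.0000000),
  nrow = 4, byrow = TRUE, dimnames = list(ids, ids)
)

# Orderings as produced by expand.grid(); keep the names as character, not factor.
ord <- expand.grid(
  Var1 = c("FOV5.1.T4.C2", "FOV5.1.T4.C2.1"),
  Var2 = c("FOV5.1.T4.C1", "FOV5.1.T4.C1.1"),
  stringsAsFactors = FALSE
)

# Indexing a matrix with a two-column character matrix picks one
# (row name, column name) cell per row of the index.
ord$Dist <- dm[cbind(ord$Var1, ord$Var2)]
ord
```

Passing stringsAsFactors = FALSE matters here: if the columns were factors, cbind() would combine their integer codes rather than their names, and the lookup would hit the wrong cells.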

Edit: The end goal is to generalize this to orderings data frames with many orderings (rows) and many elements (columns). For example:

Distance matrix:

                FOV10.5.T1.C1 FOV10.5.T1.C1.1 FOV10.5.T6.C1 FOV10.5.T6.C1.1 FOV10.5.T7.C2 FOV10.5.T7.C2.1 FOV10.5.T7.C4 FOV10.5.T7.C4.1
FOV10.5.T1.C1        0.000000        9.259314      9.525777        4.920990      8.520076        3.246356     10.429007       12.771907
FOV10.5.T1.C1.1      9.259314        0.000000      2.903446        6.485444      2.604540        6.943048      2.962850       12.658076
FOV10.5.T6.C1        9.525777        2.903446      0.000000        8.185294      1.095356        8.058659      5.763981        9.949294
FOV10.5.T6.C1.1      4.920990        6.485444      8.185294        0.000000      7.233955        1.724583      6.384782       15.156368
FOV10.5.T7.C2        8.520076        2.604540      1.095356        7.233955      0.000000        7.054426      5.528189       10.060419
FOV10.5.T7.C2.1      3.246356        6.943048      8.058659        1.724583      7.054426        0.000000      7.488958       13.938926
FOV10.5.T7.C4       10.429007        2.962850      5.763981        6.384782      5.528189        7.488958      0.000000       15.570799
FOV10.5.T7.C4.1     12.771907       12.658076      9.949294       15.156368     10.060419       13.938926     15.570799        0.000000

Orderings:

              Var1            Var2            Var3            Var4   Dist
1    FOV10.5.T1.C1   FOV10.5.T7.C4   FOV10.5.T7.C2   FOV10.5.T6.C1   sum(Var1 --> Var2, Var2 --> Var3, Var3 --> Var4)
2  FOV10.5.T1.C1.1   FOV10.5.T7.C4   FOV10.5.T7.C2   FOV10.5.T6.C1   ...
3    FOV10.5.T1.C1 FOV10.5.T7.C4.1   FOV10.5.T7.C2   FOV10.5.T6.C1   ...
4  FOV10.5.T1.C1.1 FOV10.5.T7.C4.1   FOV10.5.T7.C2   FOV10.5.T6.C1   ...
5    FOV10.5.T1.C1   FOV10.5.T7.C4 FOV10.5.T7.C2.1   FOV10.5.T6.C1
6  FOV10.5.T1.C1.1   FOV10.5.T7.C4 FOV10.5.T7.C2.1   FOV10.5.T6.C1
7    FOV10.5.T1.C1 FOV10.5.T7.C4.1 FOV10.5.T7.C2.1   FOV10.5.T6.C1
8  FOV10.5.T1.C1.1 FOV10.5.T7.C4.1 FOV10.5.T7.C2.1   FOV10.5.T6.C1
9    FOV10.5.T1.C1   FOV10.5.T7.C4   FOV10.5.T7.C2 FOV10.5.T6.C1.1
10 FOV10.5.T1.C1.1   FOV10.5.T7.C4   FOV10.5.T7.C2 FOV10.5.T6.C1.1
11   FOV10.5.T1.C1 FOV10.5.T7.C4.1   FOV10.5.T7.C2 FOV10.5.T6.C1.1
12 FOV10.5.T1.C1.1 FOV10.5.T7.C4.1   FOV10.5.T7.C2 FOV10.5.T6.C1.1
13   FOV10.5.T1.C1   FOV10.5.T7.C4 FOV10.5.T7.C2.1 FOV10.5.T6.C1.1
14 FOV10.5.T1.C1.1   FOV10.5.T7.C4 FOV10.5.T7.C2.1 FOV10.5.T6.C1.1
15   FOV10.5.T1.C1 FOV10.5.T7.C4.1 FOV10.5.T7.C2.1 FOV10.5.T6.C1.1
16 FOV10.5.T1.C1.1 FOV10.5.T7.C4.1 FOV10.5.T7.C2.1 FOV10.5.T6.C1.1

1 answer:

Answer 0 (score: 1)

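library(dplyr)
library(reshape2)

# Five random 2-D points named a..e (one point per column).
# No seed is set, so the numbers differ on every run.
points <- replicate(5, sample(1:100, 2, TRUE)) %>%
            `colnames<-`(letters[1:5])

# Pairwise distances in long format: one row per (Var1, Var2) pair.
dists <-  points %>%
            t %>%
            dist %>%
            as.matrix %>%
            melt(value.name = 'dist') %>%
            mutate_if(is.factor, as.character)

# Random paths: replicate() returns a 4 x 5 matrix, so this gives
# 4 paths (rows) of 5 stops each (columns V1..V5).
paths <- replicate(5, sample(colnames(points), 4, TRUE)) %>%
            as.data.frame %>%
            mutate(tot.dist = NA)

for(i in 1:nrow(paths)){
    # One leg between each pair of consecutive stops;
    # the last column is tot.dist, so there are ncol(paths) - 2 legs.
    d <- numeric(ncol(paths) - 2)
    for(j in 2:(ncol(paths) - 1)){
        d[j - 1] <- dists %>%
                        filter(Var1 == paths[i, j - 1] & Var2 == paths[i, j]) %>%
                        select(dist) %>%
                        unlist
    }
    paths$tot.dist[i] <- sum(d)
}

paths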

#   V1 V2 V3 V4 V5  tot.dist
# 1  c  c  e  e  c  99.29753
# 2  a  b  e  d  a 173.82135
# 3  d  e  e  d  a  87.30152
# 4  a  b  a  b  e 251.46679
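As an alternative to filtering the long-format table once per leg, the same totals can be obtained by indexing the square distance matrix directly, one column pair at a time. This is only a sketch in base R: `path_length` is a hypothetical helper name, and the three fixed points are made up so the result can be checked by hand.

```r
# Hypothetical helper: sums the distances between consecutive stops of each row.
# `paths` holds only the stop columns; `dm` is a square matrix with dimnames.
path_length <- function(paths, dm) {
  p <- as.matrix(paths)                # character matrix, one path per row
  total <- numeric(nrow(p))
  for (j in seq_len(ncol(p) - 1)) {
    # Look up leg j -> j + 1 for all rows at once via (row name, column name) pairs.
    total <- total + dm[cbind(p[, j], p[, j + 1])]
  }
  total
}

# Small fixed example: a 3-4-5 right triangle.
pts <- rbind(a = c(0, 0), b = c(3, 4), c = c(3, 0))
dm <- as.matrix(dist(pts))

paths <- data.frame(V1 = c("a", "c"), V2 = c("b", "a"), V3 = c("c", "b"),
                    stringsAsFactors = FALSE)
paths$tot.dist <- path_length(paths[, c("V1", "V2", "V3")], dm)
paths
```

With these points the legs are a-b = 5, b-c = 4 and a-c = 3, so the two paths total 9 and 8. The loop runs once per column pair rather than once per cell, which should scale better when the orderings data frame has many rows.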