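As the title says: I want to compute the distances between points at adjacent time points and find the n shortest paths through all time points.
I have posted an example below. In this example there are 2 distinct regions (in 3D space) where the points are located. Within each region we have multiple time points. I want to compute the distances along T1 --> T2 --> ... --> T8 while enforcing the time point ordering. I ended up thinking of this as a kind of tree: we branch from the first T1 point to the 2 (or more) T2 points, then from each T2 point to each T3 point, and so on. Once the tree is built, we can compute the distance of every path from start to end and return the top n paths with the smallest distance. In short, the goal is to connect each T1 node to its respective shortest path. There may well be a more efficient or better way to do this.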
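The tree described above can be sketched by brute force: enumerate every path that takes one point per time point, sum the step lengths, and keep the n smallest totals. This is only a minimal illustration, not the asker's code. The function name `top_n_paths` and the toy coordinates are made up for the example, and the number of paths grows as the product of the points per time point, so it is only practical for small inputs.

```python
import heapq
import itertools
import math

def top_n_paths(layers, n):
    """Return the n shortest time-ordered paths as (total_length, path) pairs.

    layers: one list of coordinate tuples per time point, in time order.
    """
    def length(path):
        # Sum of Euclidean distances between consecutive points.
        return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

    # Every leaf of the "tree": one point chosen from each time point.
    all_paths = itertools.product(*layers)
    return heapq.nsmallest(n, ((length(p), p) for p in all_paths))

# Made-up toy data: two points per time point, three time points.
layers = [
    [(0.0, 0.0, 0.0), (10.0, 10.0, 0.0)],   # T1
    [(10.0, 10.0, 1.0), (0.0, 0.0, 1.0)],   # T2
    [(0.0, 0.0, 2.0), (10.0, 10.0, 2.0)],   # T3
]

for total, path in top_n_paths(layers, 2):
    print(round(total, 3), path)
```

With this toy data the two shortest paths are the ones that stay inside a single region, each with total length 2.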
Example data:
> example
Timepoint Centre.int.X Centre.int.Y Centre.int.Z
FOV4.Beads.T1.C2 T1 5.102 28.529 0.789
FOV4.Beads.T1.C2.1 T1 37.904 50.845 0.837
FOV4.Beads.T2.C2 T2 37.905 50.843 1.022
FOV4.Beads.T2.C2.1 T2 5.083 28.491 0.972
FOV4.Beads.T4.C2 T4 37.925 50.851 0.858
FOV4.Beads.T4.C2.1 T4 5.074 28.479 0.785
FOV4.Beads.T5.C2 T5 37.908 50.847 0.977
FOV4.Beads.T5.C2.1 T5 5.102 28.475 0.942
FOV4.Beads.T6.C2 T6 5.114 28.515 0.643
FOV4.Beads.T6.C2.1 T6 37.927 50.869 0.653
FOV4.Beads.T7.C2 T7 37.930 50.875 0.614
FOV4.Beads.T7.C2.1 T7 5.132 28.525 0.579
FOV4.Beads.T8.C2 T8 4.933 28.674 0.800
FOV4.Beads.T8.C2.1 T8 37.918 50.816 0.800
The baseline code used to generate the plot is posted below:
require(scatterplot3d)
with(example, {
  s3d <- scatterplot3d(Centre.int.X, Centre.int.Y, Centre.int.Z,
                       pch=19,
                       cex.symbols=2,
                       col.axis="grey", col.grid="lightblue",
                       angle=45,
                       xlab="X",
                       ylab="Y",
                       zlab="Z")
})
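This is a relatively clean example, but some of my data is very messy, which is why I am trying to avoid clustering methods (e.g. k-means, dbscan, etc.). Any help would be appreciated!
Edit: added the structure details.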
structure(list(Timepoint = structure(c(1L, 1L, 2L, 2L, 4L, 4L,
5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L), .Label = c("T1", "T2", "T3",
"T4", "T5", "T6", "T7", "T8"), class = "factor"), Centre.int.X = c(5.102,
37.904, 37.905, 5.083, 37.925, 5.074, 37.908, 5.102, 5.114, 37.927,
37.93, 5.132, 4.933, 37.918), Centre.int.Y = c(28.529, 50.845,
50.843, 28.491, 50.851, 28.479, 50.847, 28.475, 28.515, 50.869,
50.875, 28.525, 28.674, 50.816), Centre.int.Z = c(0.789, 0.837,
1.022, 0.972, 0.858, 0.785, 0.977, 0.942, 0.643, 0.653, 0.614,
0.579, 0.8, 0.8)), .Names = c("Timepoint", "Centre.int.X", "Centre.int.Y",
"Centre.int.Z"), class = "data.frame", row.names = c("FOV4.Beads.T1.C2",
"FOV4.Beads.T1.C2.1", "FOV4.Beads.T2.C2", "FOV4.Beads.T2.C2.1",
"FOV4.Beads.T4.C2", "FOV4.Beads.T4.C2.1", "FOV4.Beads.T5.C2",
"FOV4.Beads.T5.C2.1", "FOV4.Beads.T6.C2", "FOV4.Beads.T6.C2.1",
"FOV4.Beads.T7.C2", "FOV4.Beads.T7.C2.1", "FOV4.Beads.T8.C2",
"FOV4.Beads.T8.C2.1"))
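Answer 0 (score: 1)
Not very elegant, but for each T1 point it finds a short path by stepping to the nearest point at every subsequent time point.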
# Pairwise Euclidean distances between all points (columns 2:4 are X, Y, Z)
distance.matrix <- as.matrix(dist(example[,2:4], upper = TRUE, diag = TRUE))
# Row indices of the T1 points, used as starting nodes
t1s <- grep("T1", rownames(distance.matrix))
paths <- lapply(t1s, function (t) {
  path <- rownames(distance.matrix)[t]
  distance <- NULL
  # Time points present in the example data (T3 is missing)
  for (i in c(2,4:8))
  {
    next.nodes <- grep(paste0("T", i), rownames(distance.matrix))
    # Greedy step: nearest point at the next time point
    next.t <- names(which.min(distance.matrix[t,next.nodes]))
    path <- c(path, next.t)
    distance <- sum(distance, distance.matrix[t,next.t])
    t <- next.t
  }
  output <- list(path, distance)
  names(output) <- c("Path", "Total Distance")
  return(output)
})
Edit: trimmed some unneeded lines.
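One caveat worth illustrating: stepping to the nearest point at each time point is a greedy strategy, and a greedy path is not guaranteed to be the shortest one overall. The sketch below uses made-up 1-D positions (not the data from the question) and hypothetical helper names to show a case where the two differ.

```python
import itertools

# Made-up 1-D positions, one list per time point.
layers = [[0.0], [1.0, -2.0], [-10.0]]

def path_length(path):
    """Total distance travelled along a sequence of 1-D positions."""
    return sum(abs(b - a) for a, b in zip(path, path[1:]))

def greedy_path(layers):
    """Always step to the nearest point at the next time point."""
    path = [layers[0][0]]
    for layer in layers[1:]:
        path.append(min(layer, key=lambda p: abs(p - path[-1])))
    return path

def optimal_path(layers):
    """Exhaustive search from the first starting point (fine for toy sizes)."""
    candidates = itertools.product([layers[0][0]], *layers[1:])
    return list(min(candidates, key=path_length))

g, o = greedy_path(layers), optimal_path(layers)
print("greedy: ", g, path_length(g))
print("optimal:", o, path_length(o))
```

Here the greedy path goes 0 -> 1 -> -10 for a total of 12, while 0 -> -2 -> -10 totals 10. On well-separated data like the example in the question the two approaches should agree, but on messier data they may not.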
Answer 1 (score: 1)
Here is an implementation in Python:
from io import StringIO
import numpy as np
import pandas as pd
# Read data
s = """Name Timepoint Centre.int.X Centre.int.Y Centre.int.Z
FOV4.Beads.T1.C2 T1 5.102 28.529 0.789
FOV4.Beads.T1.C2.1 T1 37.904 50.845 0.837
FOV4.Beads.T2.C2 T2 37.905 50.843 1.022
FOV4.Beads.T2.C2.1 T2 5.083 28.491 0.972
FOV4.Beads.T4.C2 T4 37.925 50.851 0.858
FOV4.Beads.T4.C2.1 T4 5.074 28.479 0.785
FOV4.Beads.T5.C2 T5 37.908 50.847 0.977
FOV4.Beads.T5.C2.1 T5 5.102 28.475 0.942
FOV4.Beads.T6.C2 T6 5.114 28.515 0.643
FOV4.Beads.T6.C2.1 T6 37.927 50.869 0.653
FOV4.Beads.T7.C2 T7 37.930 50.875 0.614
FOV4.Beads.T7.C2.1 T7 5.132 28.525 0.579
FOV4.Beads.T8.C2 T8 4.933 28.674 0.800
FOV4.Beads.T8.C2.1 T8 37.918 50.816 0.800"""
# Split on any run of whitespace so aligned (multi-space) columns also parse
df = pd.read_csv(StringIO(s), sep=r"\s+", index_col=0, header=0)
# Get time point ids
ts = sorted(df.Timepoint.unique())
# Get the spatial points in each time point
points = [df[df.Timepoint == t].iloc[:, -3:].values.copy() for t in ts]
# Get the spatial point names in each time point
point_names = [list(df[df.Timepoint == t].index) for t in ts]
# Find the best next point starting from the end
best_nexts = []
accum_dists = [np.zeros(len(points[-1]))]
for t_prev, t_next in zip(reversed(points[:-1]), reversed(points[1:])):
    t_dists = np.linalg.norm(t_prev[:, np.newaxis, :] - t_next[np.newaxis, :, :], axis=-1)
    t_dists += accum_dists[-1][np.newaxis, :]
    t_best_nexts = np.argmin(t_dists, axis=1)
    t_accum_dists = t_dists[np.arange(len(t_dists)), t_best_nexts]
    best_nexts.append(t_best_nexts)
    accum_dists.append(t_accum_dists)
# Reverse back the best next points and accumulated distances
best_nexts = list(reversed(best_nexts))
accum_dists = list(reversed(accum_dists))
# Reconstruct the paths
paths = []
for i, p in enumerate(point_names[0]):
    cost = accum_dists[0][i]
    path = [p]
    idx = i
    for t_best_nexts, t_point_names in zip(best_nexts, point_names[1:]):
        next_idx = t_best_nexts[idx]
        path.append(t_point_names[next_idx])
        idx = next_idx
    paths.append((path, cost))
for i, (path, cost) in enumerate(paths):
    print("Path {} (total distance {}):".format(i, cost))
    print("\n".join("\t{}".format(p) for p in path))
    print()
Output:
Path 0 (total distance 1.23675871386137):
FOV4.Beads.T1.C2
FOV4.Beads.T2.C2.1
FOV4.Beads.T4.C2.1
FOV4.Beads.T5.C2.1
FOV4.Beads.T6.C2
FOV4.Beads.T7.C2.1
FOV4.Beads.T8.C2
Path 1 (total distance 1.031072818390815):
FOV4.Beads.T1.C2.1
FOV4.Beads.T2.C2
FOV4.Beads.T4.C2
FOV4.Beads.T5.C2
FOV4.Beads.T6.C2.1
FOV4.Beads.T7.C2
FOV4.Beads.T8.C2.1
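Explanation:
This is essentially the same as the Viterbi algorithm. Start from the end and assign an initial cost of zero to every final node. Then, for each pair of consecutive time points t_prev and t_next, compute the distance between every possible pair of points and add the cost already accumulated for each point in t_next. For each point in t_prev, pick the next point with the lowest cost, then move on to the previous time point. At the end, best_nexts holds, for every point at every time point, the best point at the next time point.
Reconstruction is then just a matter of following these indices in best_nexts. For each possible initial point, pick the best point at the next time point and continue.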
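The question also asks for the top n paths rather than a single best one. One possible extension of the same dynamic-programming idea is to keep the k cheapest partial paths at every point instead of only the cheapest. The sketch below is an assumption-laden illustration rather than part of the answer above: it runs forward in time instead of backward, the name `k_best_paths` and the toy coordinates are invented, and it uses only the standard library.

```python
import heapq
import math

def k_best_paths(layers, k):
    """Return the k shortest time-ordered paths as (cost, [point index per layer]).

    layers: one list of coordinate tuples per time point, in time order.
    """
    # best[j] holds up to k (cost, path) pairs for partial paths ending at point j.
    best = [[(0.0, [j])] for j in range(len(layers[0]))]
    for prev, nxt in zip(layers, layers[1:]):
        best = [
            heapq.nsmallest(
                k,
                # Extend every kept partial path ending in `prev` to point j.
                ((cost + math.dist(p, q), path + [j])
                 for p, partial in zip(prev, best)
                 for cost, path in partial),
            )
            for j, q in enumerate(nxt)
        ]
    # Merge the candidates from all final points and keep the k cheapest.
    return heapq.nsmallest(k, (cp for partial in best for cp in partial))

# Made-up toy data: two regions, three time points.
layers = [
    [(0.0, 0.0, 0.0), (10.0, 10.0, 0.0)],
    [(10.0, 10.0, 1.0), (0.0, 0.0, 1.0)],
    [(0.0, 0.0, 2.0), (10.0, 10.0, 2.0)],
]

for cost, path in k_best_paths(layers, 2):
    print(round(cost, 3), path)
```

Keeping k candidates per point is enough because any prefix of one of the k best complete paths must itself be among the k best partial paths ending at that point. The work per pair of time points grows with k times the product of the two layer sizes, rather than with the total number of paths.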