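I am trying to create a plot from t-SNE results in which each point is colored by its neighborhood density, i.e. the number of neighbors around the point and the distances to those neighbors.
Given the matrix of t-SNE result coordinates: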
[,1] [,2]
[1,] -4.2060515 3.1718312
[2,] -4.2671476 5.6677296
[3,] -3.1792470 3.5504695
[4,] -3.2507526 4.7510075
[5,] -4.5662531 3.3866132
[6,] -5.0863544 3.1760014
[7,] -4.7380256 5.5291478
[8,] -5.0510355 5.0373626
[9,] -4.3288679 4.3316772
[10,] -5.2947188 4.6130757
[etc,] ... ...
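I would like to be able to color the points according to the criteria above.
So far, all I have managed to get is this, which is just the average Euclidean distance, and it is not right:
Ideally, I would like something that looks similar to a rough mockup, where points that are close together are colored darker than points with fewer local neighbors: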
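# Pairwise Euclidean distances between the embedded points
d <- dist(best.tsne, method = "euclidean")
# quick.scale is a helper not shown in the post; it appears to rescale values to [floor, ceiling]
d.scaled <- quick.scale(apply(as.matrix(d), 2, sum),
                        floor = 0, ceiling = 1)
# Bin the scaled scores into 99 intervals and map each bin to a color
ii <- cut(d.scaled,
          breaks = seq(min(d.scaled), max(d.scaled), length.out = 100),
          include.lowest = TRUE)
colors <- colorRampPalette(c("white", "blue"))(99)[ii]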
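I can assign the colors and so on; I just need a way to compute the score.
Answer 0 (score: 1)
There are many approaches, but the most common are to use a two-dimensional kernel density estimate or to compute a measure similar to yours that is better adapted to the data.
Here are a few examples:
1 - Two-dimensional kernel: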
# With kde2d {MASS}
library(MASS)
attach(geyser)
plot(duration, waiting, xlim = c(0.5,6), ylim = c(40,100))
f1 <- kde2d(duration, waiting, n = 50, lims = c(0.5, 6, 40, 100))
image(f1)
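To get one score per point rather than a density image, one option is to read the kernel estimate off the grid at each point's location. The following is a minimal sketch, assuming the t-SNE coordinates sit in a two-column matrix. The synthetic `best.tsne` data, the grid size `n = 100`, and the palette are illustrative assumptions, not taken from the question.

```r
library(MASS)

# Hypothetical stand-in for the t-SNE coordinates: one tight cluster
# of 100 points and one loose cluster of 20 points.
set.seed(1)
best.tsne <- rbind(matrix(rnorm(200, sd = 0.3), ncol = 2),
                   matrix(rnorm(40, mean = 3, sd = 1.5), ncol = 2))

# Estimate the 2D density on a regular 100 x 100 grid.
f <- kde2d(best.tsne[, 1], best.tsne[, 2], n = 100)

# Find the grid cell that contains each point (indices stay inside the grid).
ix <- findInterval(best.tsne[, 1], f$x, all.inside = TRUE)
iy <- findInterval(best.tsne[, 2], f$y, all.inside = TRUE)

# Density score per point: rows of f$z follow f$x, columns follow f$y.
dens <- f$z[cbind(ix, iy)]

# Map the score to 99 colors, light for sparse points and dark for dense ones.
pal  <- colorRampPalette(c("lightblue", "darkblue"))(99)
cols <- pal[cut(dens, breaks = 99, labels = FALSE)]

plot(best.tsne, col = cols, pch = 19, xlab = "tSNE 1", ylab = "tSNE 2")
```

The grid lookup is approximate: a finer grid (larger `n`) gives a closer match to the density at the exact point location, at the cost of more computation.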
2 - Ad-hoc measure (1):
# 20% trimmed mean of each point's distances (trim must lie in [0, 0.5])
apply(as.matrix(d), 2, mean, trim = 0.2)
3 - Ad-hoc measure (2):
# Normalized inverse distance
apply(as.matrix(1/((1+d)/max(1+d))), 2, mean)
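Summing or averaging the distances to every other point mostly reflects how central a point is in the whole embedding, not how crowded its immediate surroundings are. A more local score can be built from the same distance matrix by looking only at nearby points. The sketch below shows two such scores. The synthetic `best.tsne` data, the radius `r`, and the neighbor count `k` are hypothetical choices that would need tuning for real data.

```r
# Hypothetical stand-in for the t-SNE coordinates: one tight cluster
# of 100 points and one loose cluster of 20 points.
set.seed(1)
best.tsne <- rbind(matrix(rnorm(200, sd = 0.3), ncol = 2),
                   matrix(rnorm(40, mean = 3, sd = 1.5), ncol = 2))

# Full pairwise Euclidean distance matrix (diagonal is 0).
D <- as.matrix(dist(best.tsne, method = "euclidean"))

# Score 1: number of other points within radius r of each point.
r <- 0.5                              # assumed radius
n.within <- rowSums(D <= r) - 1       # subtract 1 to exclude the point itself

# Score 2: inverse of the mean distance to the k nearest neighbors.
# sort(row)[1] is the point's zero distance to itself, so it is skipped.
# Duplicate coordinates could make the mean 0 and the score infinite.
k <- 10                               # assumed number of neighbors
knn.mean <- apply(D, 1, function(row) mean(sort(row)[2:(k + 1)]))
score <- 1 / knn.mean

# Higher values of either score indicate a denser neighborhood.
head(cbind(n.within, score))
```

Either vector can be passed to the same `cut` and `colorRampPalette` steps used in the question to assign colors.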
Regards!