Geospatial clustering with a small sample size

Time: 2017-08-23 20:26:34

Tags: r cluster-analysis

I have a very small sample of 16 coordinates:

x <- c(13.41667,13.31070,13.58806,13.31070,13.18361,
       13.19694,13.27821,13.25917,13.62833,13.31056,
       13.30170,13.30880,13.40210,13.41010,13.53250,
       13.06220)

y <- c(52.47944,52.45768,52.54944,52.45768,52.43417,
       52.50778,52.50499,52.57444,52.44444,52.45750,
       52.45370,52.56440,52.46750,52.52050,52.38220,
       52.38130)

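I first tried clustering them with kmeans, but I don't think its roughly circular clusters are what I want. I am looking for a way to group the points by density, so that every cluster contains at least 2 points. So I tried DBSCAN: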

library(dbscan)  # provides dbscan() and hullplot()

z <- cbind(x, y)
res <- dbscan(z, eps = 0.05, minPts = 2)
hullplot(z, res)

DBSCAN Cluster Plot of z with eps=.05 and minPt=2

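But this produces a clustering in which many points fall outside the cluster regions. Do you have any other ideas for clustering spatial data with such a small sample?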

1 answer:

Answer 0: (score: 3)

Try relaxing the eps parameter.

kNNdistplot(z, k = 2)

## Looks like the 'knee' is at eps = 0.08ish rather than 0.05

abline(h=.08, col = "red", lty=2)
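
To read the knee off as numbers rather than from the plot, the k-nearest-neighbour distances can also be computed directly. The following is a minimal sketch in base R only, assuming plain Euclidean distance on the raw coordinates. It should be close to what kNNdistplot() draws, although the exact curve may differ between dbscan versions. The helper n_within() is hypothetical and only there for illustration.

```r
# Same 16 coordinates as in the question
x <- c(13.41667, 13.31070, 13.58806, 13.31070, 13.18361,
       13.19694, 13.27821, 13.25917, 13.62833, 13.31056,
       13.30170, 13.30880, 13.40210, 13.41010, 13.53250,
       13.06220)
y <- c(52.47944, 52.45768, 52.54944, 52.45768, 52.43417,
       52.50778, 52.50499, 52.57444, 52.44444, 52.45750,
       52.45370, 52.56440, 52.46750, 52.52050, 52.38220,
       52.38130)
z <- cbind(x, y)

k <- 2  # neighbour rank, matching the kNNdistplot() call

# Full 16 x 16 Euclidean distance matrix
d <- as.matrix(dist(z))

# For each point, sort its distances; position 1 is the point itself (0),
# so position k + 1 is the distance to its k-th nearest neighbour
knn_dist <- apply(d, 1, function(row) sort(row)[k + 1])

# Sorted ascending, this is the curve to inspect for a knee
knn_sorted <- sort(unname(knn_dist))
print(round(knn_sorted, 3))

# Hypothetical helper: number of points whose k-th neighbour lies within eps
n_within <- function(eps) sum(knn_dist <= eps)
print(c(eps_0.05 = n_within(0.05), eps_0.08 = n_within(0.08)))
```

Comparing the two counts gives a rough sense of how many more points become reachable when eps is raised from 0.05 to 0.08.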


Then,

res <- dbscan(z, eps = 0.08, minPts = 2)  # dbscan package spells the argument minPts
hullplot(z, res)
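
Besides the hull plot, the cluster assignment can be inspected directly. This is a small sketch assuming the dbscan package is installed and that the fitted object stores one label per point in res$cluster, with 0 marking noise. The names clustered and noise are introduced here for illustration only.

```r
library(dbscan)  # assumption: the dbscan package is installed

# Same 16 coordinates as in the question
x <- c(13.41667, 13.31070, 13.58806, 13.31070, 13.18361,
       13.19694, 13.27821, 13.25917, 13.62833, 13.31056,
       13.30170, 13.30880, 13.40210, 13.41010, 13.53250,
       13.06220)
y <- c(52.47944, 52.45768, 52.54944, 52.45768, 52.43417,
       52.50778, 52.50499, 52.57444, 52.44444, 52.45750,
       52.45370, 52.56440, 52.46750, 52.52050, 52.38220,
       52.38130)
z <- cbind(x, y)

res <- dbscan(z, eps = 0.08, minPts = 2)

# Number of points per cluster label; label 0 is noise
print(table(res$cluster))

# Attach the labels to the coordinates for further inspection
clustered <- data.frame(z, cluster = res$cluster)

# Points that were not assigned to any cluster
noise <- clustered[clustered$cluster == 0, ]
print(noise)
```

If the noise table is still larger than wanted, eps can be raised further, at the cost of merging nearby clusters.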
