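I have a set of points (~1000) and a set of cluster centers (~100). I now want to cluster the points while taking the known cluster centers into account. Every cluster should start at a known cluster center and grow outward, collecting all points that lie less than x meters from the nearest point already inside the cluster.
I currently have the following, fairly standard PostGIS DBSCAN query: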
WITH clusters AS (
    SELECT
        landmark_id, coordinate,
        ST_ClusterDBSCAN(coordinate, eps := (30 / 111111.0), minpoints := 10) OVER() AS cluster_id
    FROM landmarks
    WHERE coordinate IS NOT NULL
)
SELECT
    cluster.id, cluster.landmark_ids,
    ST_Centroid(cluster.geometry) AS coordinate,
    ST_AsGeoJSON(cluster.geometry) AS geometry
FROM (
    SELECT
        cluster_id AS id,
        array_agg(landmark_id) AS landmark_ids,
        ST_ConvexHull(ST_Collect(coordinate)) AS geometry
    FROM clusters
    WHERE cluster_id IS NOT NULL
    GROUP BY cluster_id
) AS cluster;
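Any pointers on how I could adapt the query above, or write a different one, to do what I want without resorting to procedural code? (If procedural code is unavoidable, I would appreciate some pointers on that as well.)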
Answer 0 (score: 1)
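By points "already inside the cluster", I am not sure whether you mean only the points picked up by the first clustering pass, or also the points that would be picked up recursively afterwards.
This solution compares only against the original clusters and does not attempt any recursive cluster matching. That would require a recursive query, and I doubt it would produce a better answer.
I am also not sure why you decided to compute the centroid from the convex hull. I would assume you want the true centroid, which can be computed directly on the ST_Collect output.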
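To illustrate the centroid remark, here is a small self-contained sketch on made-up data (the five points below are purely illustrative and are not taken from the landmarks table). It computes the centroid once from the collected points and once from their convex hull:

```sql
-- Illustrative data only: four corners of a square plus one interior point.
WITH pts(geom) AS (
    VALUES
        (ST_MakePoint(0, 0)),
        (ST_MakePoint(4, 0)),
        (ST_MakePoint(4, 4)),
        (ST_MakePoint(0, 4)),
        (ST_MakePoint(1, 1))
)
SELECT
    -- centroid of the multipoint: the mean of all point coordinates
    ST_AsText(ST_Centroid(ST_Collect(geom)))                AS centroid_of_points,
    -- centroid of the hull polygon: the area centroid, ignoring interior points
    ST_AsText(ST_Centroid(ST_ConvexHull(ST_Collect(geom)))) AS centroid_of_hull
FROM pts;
```

The first value should be pulled toward the interior point (about 1.8, 1.8), while the hull centroid should stay at the middle of the square (2, 2). Which of the two you want depends on whether the cluster should be represented by its members or by its outline. The clustering query itself follows.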
WITH cluster1 AS (
    -- initial DBSCAN pass; eps is roughly 30 m expressed in degrees,
    -- assuming lon/lat coordinates (1 degree is about 111 km)
    SELECT
        landmark_id, coordinate,
        ST_ClusterDBSCAN(coordinate, eps := (30 / 111111.0), minpoints := 10) OVER() AS cluster_id
    FROM landmarks
    WHERE coordinate IS NOT NULL
),
clustered AS (
    -- landmarks that DBSCAN assigned to a cluster
    SELECT * FROM cluster1 WHERE cluster_id IS NOT NULL
),
clusterall AS (
    SELECT
        l.landmark_id, l.coordinate, c.cluster_id
    FROM landmarks AS l
    CROSS JOIN
    -- find closest cluster
    LATERAL (SELECT c.cluster_id
             FROM clustered AS c
             ORDER BY c.coordinate <-> l.coordinate LIMIT 1) AS c
    -- only look for landmarks not matched to a cluster
    WHERE l.coordinate IS NOT NULL
      AND l.landmark_id NOT IN (SELECT c2.landmark_id FROM clustered AS c2)
    UNION ALL
    -- keep the landmarks that DBSCAN already clustered
    SELECT c.landmark_id, c.coordinate, c.cluster_id
    FROM clustered AS c
)
SELECT
    cluster.id, cluster.landmark_ids,
    ST_Centroid(cluster.geometry) AS coordinate,
    ST_AsGeoJSON(cluster.geometry) AS geometry
FROM (
    SELECT
        cluster_id AS id,
        array_agg(landmark_id) AS landmark_ids,
        ST_ConvexHull(ST_Collect(coordinate)) AS geometry
    FROM clusterall
    GROUP BY cluster_id
) AS cluster;
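For completeness, the recursive variant mentioned above could be sketched roughly as follows. This is only a sketch under assumptions: the question does not show how the known cluster centers are stored, so the table `centers` with columns `center_id` and `coordinate` is hypothetical, and the distance threshold reuses the same degree-based approximation of 30 m as the DBSCAN query.

```sql
-- Hypothetical table: centers(center_id, coordinate) holds the known cluster centers.
WITH RECURSIVE grown(cluster_id, landmark_id) AS (
    -- seed step: landmarks within the threshold of a known center
    SELECT c.center_id, l.landmark_id
    FROM centers AS c
    JOIN landmarks AS l
      ON ST_DWithin(l.coordinate, c.coordinate, 30 / 111111.0)
    UNION
    -- growth step: landmarks within the threshold of any landmark already collected;
    -- UNION (not UNION ALL) discards rows seen before, so the recursion terminates
    SELECT g.cluster_id, n.landmark_id
    FROM grown AS g
    JOIN landmarks AS m ON m.landmark_id = g.landmark_id
    JOIN landmarks AS n
      ON ST_DWithin(n.coordinate, m.coordinate, 30 / 111111.0)
)
SELECT
    g.cluster_id AS id,
    array_agg(g.landmark_id) AS landmark_ids,
    ST_Centroid(ST_Collect(l.coordinate)) AS coordinate
FROM grown AS g
JOIN landmarks AS l USING (landmark_id)
GROUP BY g.cluster_id;
```

Two caveats with this sketch: a landmark reachable from two different centers ends up in both clusters, so a tie-breaking rule would still be needed, and landmarks that no center can reach are left out entirely. If the threshold should be expressed in true meters, casting both arguments of ST_DWithin to geography and passing 30 is one option, assuming the coordinates are lon/lat.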