Question

Background

Hi all, I recently came across Google's open-source S2 library:

https://github.com/google/s2geometry

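I'm currently writing an application that needs to find the K closest points to a given origin point. Right now I do this in PostgreSQL, using a geospatial index on columns containing latitude/longitude values, but I was looking for alternatives when S2 caught my attention.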

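Questions

I'm new to the library and have a few questions about it.

1) Is it possible to find the K closest points using the S2 library?

2) How fast are queries in S2 compared with a geospatial index (faster/slower/the same/etc.)?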

Answer 1

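Google's S2 library is a form of hashing: it maps each point to a cell ID. Because a lookup is then just a hash/ID lookup, it can be used to significantly speed up geo lookups.

One way to index would be: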

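1) Index all the points you care about at a fairly large S2 cell level. You should evaluate your own points and see which level suits you based on this chart.
2) At retrieval time, convert the search point to an S2 cell at that same level, then pull all the candidate points stored under that cell.
3) (Optional, depending on how much precision you care about) Compute the distance between each candidate point and the search point, then sort.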
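The optional distance-and-sort step can be sketched in plain JavaScript. This is a minimal illustration, not part of the S2 library: `haversineMeters` and `kNearest` are hypothetical helper names, the Earth is assumed to be a sphere, and all three sample users are passed in as candidates purely to show the ranking (in practice the candidates would be whatever the cell lookup returned).

```javascript
// Assumption: spherical Earth with the mean radius in meters.
const EARTH_RADIUS_M = 6371008.8;

// Hypothetical helper: great-circle distance between two [lng, lat] pairs.
function haversineMeters([lng1, lat1], [lng2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Hypothetical helper: rank candidates by distance and keep the k closest.
function kNearest(searchPoint, candidates, k) {
  return candidates
    .map(({ id, lngLat }) => ({ id, meters: haversineMeters(searchPoint, lngLat) }))
    .sort((a, b) => a.meters - b.meters)
    .slice(0, k);
}

const candidates = [
  { id: 'user1', lngLat: [-73.95772933959961, 40.71623280185081] },
  { id: 'user2', lngLat: [-73.95927429199219, 40.71629785715124] },
  { id: 'user3', lngLat: [-73.99206161499023, 40.688708709249646] },
];
const searchPoint = [-73.98991584777832, 40.69528168934989];

const nearest = kNearest(searchPoint, candidates, 2);
console.log(nearest.map((c) => c.id)); // [ 'user3', 'user2' ]
```

Sorting the full candidate list is fine when a cell holds few points; for large candidate sets a bounded heap would avoid sorting everything.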

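There are trade-offs to weigh for this performance gain:

Indexing S2 cells on your points means more storage (each ID is a 64-bit integer).
You may miss points that lie outside the S2 cell you query on, for example a close neighbor just across a cell boundary. You can index at multiple S2 levels to make sure enough points are retrieved. Depending on the density of your points, this may not be a problem.
Retrieving by S2 cell ID does not actually give you the distance between points; you have to compute that yourself.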
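To illustrate the multi-level idea, here is a rough sketch that derives coarser parent tokens from a level-13 token, so a lookup can fall back to a larger cell when the fine cell holds too few points. It reproduces the S2 cell ID bit layout by hand with BigInt rather than calling the library; `tokenToId`, `idToToken`, `parentToken`, and `fallbackTokens` are hypothetical names, and with a real S2 binding you would call its own `parent(level)` instead.

```javascript
const MAX_LEVEL = 30; // S2 leaf cells are at level 30

// A token is the 64-bit cell ID in hex with trailing zeros stripped.
function tokenToId(token) {
  return BigInt('0x' + token.padEnd(16, '0'));
}

function idToToken(id) {
  return id.toString(16).padStart(16, '0').replace(/0+$/, '');
}

// Parent at `level`: clear the bits below that level's marker bit, then set
// the marker bit. Assumes `level` is not finer than the token's own level.
function parentToken(token, level) {
  const id = tokenToId(token);
  const lsb = 1n << BigInt(2 * (MAX_LEVEL - level));
  return idToToken((id & -lsb) | lsb);
}

// Tokens to probe in order, from the finest level down to the coarsest.
function fallbackTokens(token, fromLevel, toLevel) {
  const tokens = [];
  for (let level = fromLevel; level >= toLevel; level--) {
    tokens.push(parentToken(token, level));
  }
  return tokens;
}

console.log(fallbackTokens('89c25a4c', 13, 9));
// [ '89c25a4c', '89c25a5', '89c25a4', '89c25b', '89c25c' ]
console.log(parentToken('89c2595c', 9)); // '89c25c'
```

With the two level-13 tokens from the sample below, the cells only share an ancestor at level 9, which shows how far a search may need to widen before it reaches points in a neighboring cell.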

Here is a code example using the Node S2 library:

const s2 = require('@radarlabs/s2');

const user1LongLat = [-73.95772933959961, 40.71623280185081];
const user2LongLat = [-73.95927429199219, 40.71629785715124];
const user3LongLat = [-73.99206161499023, 40.688708709249646];

const user1S2 = ["user1", new s2.CellId(new s2.LatLng(user1LongLat[1], user1LongLat[0])).parent(13)];
const user2S2 = ["user2", new s2.CellId(new s2.LatLng(user2LongLat[1], user2LongLat[0])).parent(13)];
const user3S2 = ["user3", new s2.CellId(new s2.LatLng(user3LongLat[1], user3LongLat[0])).parent(13)];

const groups = {};
[user1S2, user2S2, user3S2].forEach(([userId, cellId]) => {
  const group = groups[cellId.token()] || [];
  group.push(userId);
  groups[cellId.token()] = group;
});

const searchPointLongLat = [-73.98991584777832, 40.69528168934989];
const searchPointS2 = new s2.CellId(new s2.LatLng(searchPointLongLat[1], searchPointLongLat[0])).parent(13);

console.log(searchPointS2.token()); // '89c25a4c'
console.log(groups); // { '89c2595c': [ 'user1', 'user2' ], '89c25a4c': [ 'user3' ] }

const closePoints = groups[searchPointS2.token()];
console.log(closePoints); // [ 'user3' ]

Here is a map visualization of the S2 tokens that were created.

Long story short: yes, it is a form of hashing, so you can trade storage for faster lookups, but you may have to tune some aspects to your needs.

Is it possible to find the K nearest points (efficiently) using the S2 library?
