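I have a Django/PostgreSQL application that shows which users are nearest to a given user. It uses the PostGIS 2.0 KNN (K Nearest Neighbors) operator <-> in the ORDER BY clause to list the nearest users. In my initial data set, I found that two of the search results come back out of order (all distances are measured from Los Angeles, CA):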
Member, City, State, Distance (miles)
user1, North Las Vegas, NV, 239
user2, Phoenix, AZ, 365
user3, Provo, UT, 568
user4, Twin Falls, ID, 630
user5, Albuquerque, NM, 673
user6, Portland, OR, 828
user7, Bozeman, MT, 896
user8, Seattle, WA, 962
user9, Boulder, CO, 834 <- Out of order!
user10, Laramie, WY, 862 <- Out of order!
user11, Naperville, IL, 1756
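The member name is just the username column of the User class from Django's contrib.auth.models. The UserAccount class, which holds the geometry information, is defined as follows: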
from django.contrib.auth.models import User
from django.contrib.gis.db import models  # GeoDjango models (PointField, GeoManager)

class UserAccount(models.Model):
    user = models.OneToOneField(User, primary_key=True, unique=True)
    address_line_1 = models.CharField(max_length=30)
    address_line_2 = models.CharField(max_length=30, blank=True)
    city = models.CharField(max_length=30)
    region = models.CharField(max_length=30, blank=True)
    postal_code = models.CharField(max_length=10, blank=True)
    country = models.ForeignKey('Country')
    measurement_sys = models.CharField(max_length=5)  # US or Metric
    # User's home (default) and current longitude and latitude
    home_lon = models.FloatField(default=0.0)
    home_lat = models.FloatField(default=0.0)
    current_lon = models.FloatField(default=0.0)
    current_lat = models.FloatField(default=0.0)
    # GeoDjango-specific fields
    home_point = models.PointField(srid=4326)
    current_point = models.PointField(srid=4326)
    objects = models.GeoManager()
Here is the query in my Django view:
def members(request, template):
    """View all members of the website."""
    uid = request.session['uid']  # PK from User table
    # Get the current user's lon/lat and measurement system
    try:
        ua = UserAccount.objects.get(user_id=uid)
        lon = ua.current_lon
        lat = ua.current_lat
        measurement_sys = ua.measurement_sys
    except UserAccount.DoesNotExist as e:
        return HttpResponseRedirect(reverse('unable-to-display-members'))
    # Define the proximity query.
    if measurement_sys == 'US':
        multiplier = 0.000621371  # Convert to miles
    else:
        multiplier = 0.001  # Convert to kilometers
    query = "SELECT \
        ua.user_id, \
        au.username, \
        ua.city, \
        ua.region, \
        ST_Distance( \
            ua.current_point::geography, \
            ST_GeographyFromText( \
                'SRID=4326;POINT(" \
                + str(lon) \
                + " " \
                + str(lat) + \
                ")' \
            ) \
        )*" + str(multiplier) + " AS distance \
        FROM \
            user_account ua \
        INNER JOIN \
            auth_user au \
        ON (ua.user_id = au.id) \
        WHERE ua.user_id != %s \
        ORDER BY \
            ua.current_point::geometry \
            <-> \
            'SRID=4326;POINT(" + str(lon) + " " + str(lat) + ")'::geometry \
        LIMIT 250;"
    # Run the proximity query
    raw_queryset = UserAccount.objects.raw(query, [uid])
    # Paginate results
    user_list = [user for user in raw_queryset]
    list_size = len(list(user_list))
    paginator = Paginator(user_list, 10, 4)
    paginator._count = list_size
    page = request.GET.get('page')
    try:
        users = paginator.page(page)
    except PageNotAnInteger:
        users = paginator.page(1)
    except EmptyPage:
        users = paginator.page(paginator.num_pages)
    return render(request, template, {'users': users})
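Is there something wrong with my query? Does the KNN operator sometimes "hiccup" and fail to return certain results in the right order? I ask because when I removed the two out-of-order records from my table and then added extra records for users at more distant addresses (namely in IL, LA, MI, NC, PA, NY, and ME), all the results came back in the correct order.
By the way, my input data is located here.
Thanks!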
Answer 0 (score: 2)
Updated answer:
Since September 2011, PostGIS has offered two approximate solutions for kNN (nearest-neighbor) search.
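Your problem is that both of them are approximations, so they are not perfect. If you want the best 250 results, you can use either one to retrieve a larger candidate set, for example the best 1000 results, then re-order those candidates by ST_Distance and apply LIMIT 250. That gives you the best 250 of the roughly 1000 candidates.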
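To make the over-fetch-and-re-sort idea concrete, here is a small self-contained Python sketch. It does not talk to PostGIS. It assumes, for illustration only, that the KNN ordering on lon/lat points behaves like straight-line distance measured in degrees, and it uses a haversine formula as a stand-in for the geography-based ST_Distance. The coordinates are approximate values supplied for this example, not the asker's data, and the function names are hypothetical.

```python
import math

# Approximate (lon, lat) pairs, assumed for illustration only.
ORIGIN = (-118.24, 34.05)  # Los Angeles, CA
CITIES = {
    "Portland, OR": (-122.68, 45.52),
    "Bozeman, MT": (-111.04, 45.68),
    "Seattle, WA": (-122.33, 47.61),
    "Boulder, CO": (-105.27, 40.01),
    "Laramie, WY": (-105.59, 41.31),
    "Naperville, IL": (-88.15, 41.79),
}


def planar_degrees(a, b):
    """Straight-line distance in degrees (stand-in for the KNN sort key)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def haversine_miles(a, b):
    """Great-circle distance in miles (stand-in for geography ST_Distance)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))


def nearest(origin, cities, candidates, limit):
    """Fetch `candidates` rows by the approximate key, then re-sort exactly."""
    approx = sorted(cities, key=lambda n: planar_degrees(origin, cities[n]))
    approx = approx[:candidates]
    exact = sorted(approx, key=lambda n: haversine_miles(origin, cities[n]))
    return exact[:limit]


# Ordering by the approximate key alone, cut off at 4 rows.
knn_only = sorted(CITIES, key=lambda n: planar_degrees(ORIGIN, CITIES[n]))[:4]
# Over-fetch 5 candidates, re-sort by great-circle distance, keep 4.
reranked = nearest(ORIGIN, CITIES, candidates=5, limit=4)

print("KNN only :", knn_only)
print("Re-ranked:", reranked)
```

With these sample coordinates, the degree-based ordering places Seattle ahead of Boulder and Laramie, which matches the symptom in the question. Re-sorting the slightly larger candidate set by great-circle distance restores the expected order. The candidate set has to be large enough to contain every true top result, which is why the suggestion above fetches 1000 rows to return 250.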
Example:
SELECT *
FROM (
    SELECT *,
           ST_Distance(current_point::geography, 'SRID=4326;POINT(" + str(lon) + " " + str(lat) + ")'::geography) AS st_dist
    FROM user_account
    ORDER BY current_point::geometry <-> 'SRID=4326;POINT(" + str(lon) + " " + str(lat) + ")'::geometry
    LIMIT 1000
) AS s
ORDER BY st_dist
LIMIT 250;
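For the Django view in the question, the same two-stage query could be built with bound parameters instead of string concatenation. The following is a minimal sketch, not a tested drop-in. The helper name build_proximity_query and its candidates and limit arguments are hypothetical. The table and column names are taken from the question. ST_SetSRID and ST_MakePoint are standard PostGIS functions, used here in place of the concatenated EWKT literal. Whether the planner still uses the KNN index with this form is an assumption worth checking with EXPLAIN.

```python
def build_proximity_query(uid, lon, lat, multiplier, candidates=1000, limit=250):
    """Return (sql, params) for a KNN pre-filter re-sorted by exact distance.

    Hypothetical helper: the inner query over-fetches `candidates` rows using
    the <-> operator, and the outer query re-orders them by ST_Distance on
    geography and keeps the first `limit` rows.
    """
    sql = """
        SELECT * FROM (
            SELECT ua.user_id, au.username, ua.city, ua.region,
                   ST_Distance(
                       ua.current_point::geography,
                       ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography
                   ) * %s AS distance
            FROM user_account ua
            INNER JOIN auth_user au ON (ua.user_id = au.id)
            WHERE ua.user_id != %s
            ORDER BY ua.current_point::geometry
                     <-> ST_SetSRID(ST_MakePoint(%s, %s), 4326)
            LIMIT %s
        ) AS s
        ORDER BY distance
        LIMIT %s
    """
    # Parameter order must match the order of the %s placeholders above.
    params = [lon, lat, multiplier, uid, lon, lat, candidates, limit]
    return sql, params


# Example values only; in the view these come from the UserAccount row.
sql, params = build_proximity_query(uid=42, lon=-118.24, lat=34.05,
                                    multiplier=0.000621371)
```

In the view, the result would then be passed along as UserAccount.objects.raw(sql, params), the same call the question already uses with [uid]. Binding the values also removes the need to call str() on the coordinates and the multiplier.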