Elasticsearch geo distance query with MySQL relations

Date: 2016-07-20 17:55:00

Tags: php mysql database elasticsearch

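I want to run an Elasticsearch geo distance query across related tables from my MySQL database. I have a table containing location data, and another table that has a relation to the locations table. I know that NoSQL stores like Elasticsearch are not optimized for relations like this, but is it possible?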

This is my database schema:

CREATE TABLE `locations` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `description` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `lng` decimal(12,8) NOT NULL,
  `lat` decimal(12,8) NOT NULL,
  `deleted_at` timestamp NULL DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=26 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `posts` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `author` int(10) unsigned NOT NULL,
  `location_id` int(10) unsigned NOT NULL,
  `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `text` text COLLATE utf8_unicode_ci NOT NULL,
  `deleted_at` timestamp NULL DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `posts_author_foreign` (`author`),
  KEY `posts_location_id_foreign` (`location_id`),
  CONSTRAINT `posts_author_foreign` FOREIGN KEY (`author`) REFERENCES `users` (`id`),
  CONSTRAINT `posts_location_id_foreign` FOREIGN KEY (`location_id`) REFERENCES `locations` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=174 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `comments` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `post_id` int(10) unsigned NOT NULL,
  `author` int(10) unsigned NOT NULL,
  `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `text` text COLLATE utf8_unicode_ci NOT NULL,
  `deleted_at` timestamp NULL DEFAULT NULL,
  `created_at` timestamp NULL DEFAULT NULL,
  `updated_at` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `comments_author_foreign` (`author`),
  KEY `comments_post_id_foreign` (`post_id`),
  CONSTRAINT `comments_author_foreign` FOREIGN KEY (`author`) REFERENCES `users` (`id`),
  CONSTRAINT `comments_post_id_foreign` FOREIGN KEY (`post_id`) REFERENCES `posts` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=238 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

This is my index mapping (I am using the official Elasticsearch client for PHP):

<?php
return [
    'index' => 'foodie',
    'body' => [
        'mappings' => [
            'locations' => [
                'properties' => [
                    'id' => ['type' => 'string', 'index' => 'not_analyzed'],
                    'name' => ['type' => 'string'],
                    'description' => ['type' => 'string'],
                    'location' => ['type' => 'geo_point'],
                ],
            ],
            'posts' => [
                'properties' => [
                    'id' => ['type' => 'string', 'index' => 'not_analyzed'],
                    'author' => ['type' => 'string', 'index' => 'not_analyzed'],
                    'location_id' => ['type' => 'string', 'index' => 'not_analyzed'],
                    'title' => ['type' => 'string'],
                    'text' => ['type' => 'string'],
                ],
            ],
            'comments' => [
                'properties' => [
                    'id' => ['type' => 'string', 'index' => 'not_analyzed'],
                    'author' => ['type' => 'string', 'index' => 'not_analyzed'],
                    'post_id' => ['type' => 'string', 'index' => 'not_analyzed'],
                    'title' => ['type' => 'string'],
                    'text' => ['type' => 'string'],
                ],
            ]
        ],
        'settings' => [
            'analysis' => [
                'filter' => [
                ],
                'analyzer' => [
                ],
            ],
        ],
    ],
];

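I want to run a query over locations and posts (and comments too, i.e. two joins, if that is not too bad for performance) that I can filter and sort by geo distance.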

I have tried a query like this:

[
    'index' => 'index_name',
    'type' => [
        0 => 'posts',
        1 => 'locations',
        2 => 'comments'
    ],
    'body' => [
        'from' => 0,
        'size' => 10,
        'query' => [
            'bool' => [
                'must' => [
                    'multi_match' => [
                        'query' => 'search string',
                        'fields' => [
                            0 => 'title',
                            1 => 'text',
                            2 => 'name',
                            3 => 'description',
                        ],
                        'fuzziness' => 'AUTO',
                        'operator' => 'and',
                    ],
                ],
                'filter' => [
                    'geo_distance' => [
                        'distance' => '100m',
                        'location' => [
                            'lat' => 79.861,
                            'lon' => 107.31,
                        ],
                    ],
                ],
            ],
        ],
    ],
]

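It works, but it obviously filters out everything except locations, because only they carry location data. How can I include the related posts, and ideally comments, in the query?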

Thanks!

1 answer:

Answer 0: (score: 0)

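I did not find any good solution for this, but I solved it by simply adding a location field to posts and comments and fetching the coordinates of the related location when I push them into the Elasticsearch index.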
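
A minimal sketch of that denormalization step, assuming the rows have already been loaded from MySQL as associative arrays keyed by their primary key. The helper names (`withLocation`, `buildPostDocument`, `buildCommentDocument`) are made up for illustration and are not part of the Elasticsearch client. Note that the MySQL column is `lng` while a `geo_point` object expects `lon`, and that the `posts` and `comments` mappings would presumably also need a `'location' => ['type' => 'geo_point']` property for the `geo_distance` filter to apply to them.

```php
<?php
// Hypothetical helper: copy a location row's coordinates onto a child row.
function withLocation(array $row, array $location): array
{
    // MySQL returns DECIMAL columns as strings; cast them to floats and
    // rename `lng` to `lon`, the key a geo_point object expects.
    $row['location'] = [
        'lat' => (float) $location['lat'],
        'lon' => (float) $location['lng'],
    ];
    return $row;
}

// A post points at its location directly through `location_id`.
function buildPostDocument(array $post, array $locationsById): array
{
    return withLocation($post, $locationsById[$post['location_id']]);
}

// A comment reaches its location through its parent post.
function buildCommentDocument(array $comment, array $postsById, array $locationsById): array
{
    $post = $postsById[$comment['post_id']];
    return withLocation($comment, $locationsById[$post['location_id']]);
}

// Example rows (invented sample data, shaped like the schema in the question).
$locationsById = [7 => ['id' => 7, 'name' => 'Cafe', 'lat' => '79.86100000', 'lng' => '107.31000000']];
$postsById     = [3 => ['id' => 3, 'location_id' => 7, 'title' => 'Great coffee', 'text' => '...']];
$comment       = ['id' => 11, 'post_id' => 3, 'title' => 'Agreed', 'text' => '...'];

$postDocument    = buildPostDocument($postsById[3], $locationsById);
$commentDocument = buildCommentDocument($comment, $postsById, $locationsById);
```

Each resulting array can then be indexed as the document body, so the original query can match `posts` and `comments` through the same `location` field as `locations`.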

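It may not be the best solution, but it works well and is very fast because the index stays completely flat. As long as I make sure that changes to a location's coordinates propagate to the related posts and comments, I do not see any real problem with this approach, other than that storing duplicated data in the index is not very clean.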
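
One way the propagation could look, as a sketch only: when a location row changes, build a bulk request that partially updates the `location` field of every post at that location and every comment on those posts. The function name `buildLocationPropagationBulk` is hypothetical, the rows are assumed to be plain arrays from MySQL, and the `_type` metadata assumes an Elasticsearch version that still has mapping types, as the mapping in the question does.

```php
<?php
// Hypothetical helper: build bulk "update" actions that push a location's
// new coordinates to its posts and to the comments on those posts.
function buildLocationPropagationBulk(string $index, array $location, array $posts, array $comments): array
{
    $point = [
        'lat' => (float) $location['lat'],
        'lon' => (float) $location['lng'], // MySQL `lng` becomes geo_point `lon`
    ];

    $body = [];
    $affectedPostIds = [];

    foreach ($posts as $post) {
        if ((int) $post['location_id'] !== (int) $location['id']) {
            continue; // post belongs to another location
        }
        $affectedPostIds[(int) $post['id']] = true;
        // Bulk format: one action/metadata entry followed by one partial document.
        $body[] = ['update' => ['_index' => $index, '_type' => 'posts', '_id' => (string) $post['id']]];
        $body[] = ['doc' => ['location' => $point]];
    }

    foreach ($comments as $comment) {
        if (!isset($affectedPostIds[(int) $comment['post_id']])) {
            continue; // comment is on a post at another location
        }
        $body[] = ['update' => ['_index' => $index, '_type' => 'comments', '_id' => (string) $comment['id']]];
        $body[] = ['doc' => ['location' => $point]];
    }

    return ['body' => $body];
}

// Invented sample data: location 7 moved, posts 3 and 4, comments 11 and 12.
$location = ['id' => 7, 'lat' => '10.50000000', 'lng' => '20.25000000'];
$posts    = [['id' => 3, 'location_id' => 7], ['id' => 4, 'location_id' => 8]];
$comments = [['id' => 11, 'post_id' => 3], ['id' => 12, 'post_id' => 4]];

$params = buildLocationPropagationBulk('foodie', $location, $posts, $comments);
```

The returned array is shaped for the client's `bulk()` method, e.g. `$client->bulk($params)`, which should be skipped when `$params['body']` is empty. In practice the affected posts and comments would more likely be selected with a SQL query on `location_id` and `post_id` than filtered in PHP as done here.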