Question

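I have a Postgres array column that I want to index and then use in searches. Here is an example:

phones = ["+175 (2) 123-25-32", "123456789", "+12 111-111-11"]

I analyzed the tokens with the analyze API, and Elasticsearch splits the field into multiple tokens, as shown below: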

curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "standard",
  "text" : [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ]
}'


{
  "tokens": [
    {
      "token": "analyzer",
      "start_offset": 6,
      "end_offset": 14,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "standard",
      "start_offset": 19,
      "end_offset": 27,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "text",
      "start_offset": 33,
      "end_offset": 37,
      "type": "<ALPHANUM>",
      "position": 3
    },
    {
      "token": "175",
      "start_offset": 45,
      "end_offset": 48,
      "type": "<NUM>",
      "position": 4
    },
    {
      "token": "2",
      "start_offset": 50,
      "end_offset": 51,
      "type": "<NUM>",
      "position": 5
    },
    {
      "token": "123",
      "start_offset": 53,
      "end_offset": 56,
      "type": "<NUM>",
      "position": 6
    },
    {
      "token": "25",
      "start_offset": 57,
      "end_offset": 59,
      "type": "<NUM>",
      "position": 7
    },
    {
      "token": "32",
      "start_offset": 60,
      "end_offset": 62,
      "type": "<NUM>",
      "position": 8
    },
    {
      "token": "123456789",
      "start_offset": 66,
      "end_offset": 75,
      "type": "<NUM>",
      "position": 9
    },
    {
      "token": "12",
      "start_offset": 80,
      "end_offset": 82,
      "type": "<NUM>",
      "position": 10
    },
    {
      "token": "111",
      "start_offset": 83,
      "end_offset": 86,
      "type": "<NUM>",
      "position": 11
    },
    {
      "token": "111",
      "start_offset": 87,
      "end_offset": 90,
      "type": "<NUM>",
      "position": 12
    },
    {
      "token": "11",
      "start_offset": 91,
      "end_offset": 93,
      "type": "<NUM>",
      "position": 13
    }
  ]
}

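I want Elasticsearch either to skip tokenization and store the numbers without special characters (for example, "+175 (2) 123-25-32" converted to "+17521232532"), or simply to index each number as-is, so that it shows up in search results.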
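The conversion described above could be done on the client before indexing. A minimal sketch in Ruby, assuming a hypothetical helper named `normalize_phone` (not part of the original code) that keeps a leading "+" and drops every other non-digit character:

```ruby
# Hypothetical helper for illustration: strips everything except digits,
# keeping a single leading "+" when the input starts with one.
def normalize_phone(raw)
  stripped = raw.strip
  prefix = stripped.start_with?("+") ? "+" : ""
  prefix + stripped.gsub(/\D/, "")
end

phones = ["+175 (2) 123-25-32", "123456789", "+12 111-111-11"]

# Apply the same normalization to indexed values and to search input.
normalized = phones.map { |phone| normalize_phone(phone) }
```

The same function would need to run on the search terms as well, otherwise the normalized index values and the raw query strings would not match.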

My mapping is as follows:

{ :id => { :type => "string"}, :secondary_phones => { :type => "string" } }

Here is how I am trying to query:

      settings = {
        query: {
          filtered: {
            filter: {
              bool: {
                should: [
                  { terms: { phones: [ "+175 (2) 123-25-32", "123456789", "+12 111-111-11" ] } },
                ]
              }
            }
          }
        },
        size: 100,
      }

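P.S. I also tried stripping the special characters, but no luck.

I'm sure this is achievable and that I'm missing something. Suggestions, please.

Thanks.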

Answer 1

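If you only want exact matches on your data, as in your terms query example, the best approach is to set the `index` mapping parameter to `not_analyzed` in your mapping. See the Elasticsearch mapping documentation for the `index` parameter.

This completely disables analysis (tokenization) of the values and treats the field's content (each item in the array) as a single token/keyword.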
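As a rough sketch, the mapping from the question with analysis disabled might look like the following. The field names are taken from the question; `not_analyzed` applies to the legacy `string` type used there, and the variable names `mapping` and `terms_filter` are only illustrative.

```ruby
require "json"

# The question's mapping, with analysis disabled on the phone field so that
# each array item is indexed as one exact token.
mapping = {
  :id               => { :type => "string" },
  :secondary_phones => { :type => "string", :index => "not_analyzed" }
}

# An exact-match terms filter. It must target the same field name that the
# mapping defines (the question maps secondary_phones but queries phones).
terms_filter = {
  :terms => { :secondary_phones => ["+175 (2) 123-25-32", "123456789", "+12 111-111-11"] }
}

puts JSON.pretty_generate(mapping)
```

Because the `index` setting of an existing field cannot be changed in place, documents indexed under the old mapping would need to be reindexed before the filter matches.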

Elasticsearch: searching phone numbers
