How to match this query in Azure Search

Time: 2017-09-08 16:22:19

Tags: azure keyword analyzer azure-search

I have this index:

{
  "name": "testentities",
  "fields": [
    {
      "name": "id",
      "type": "Edm.String",
      "key": true,
      "retrievable": true,
      "filterable": true,
      "sortable": true
    },
    {
      "name": "entity_id",
      "type": "Edm.String",
      "searchable": true,
      "sortable": true,
      "facetable": false,
      "retrievable": true,
      "filterable": true,
      "searchAnalyzer":"standard",
      "indexAnalyzer": "custom_analyzer"
    },
    {
      "name": "description",
      "type": "Edm.String",
      "searchable": true,
      "sortable": false,
      "facetable": false,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "name",
      "type": "Edm.String",
      "searchable": true,
      "sortable": true,
      "facetable": false,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "entity_type",
      "type": "Edm.String",
      "searchable": true,
      "sortable": true,
      "facetable": true,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "ancestors",
      "type": "Collection(Edm.String)",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": true,
      "filterable": true
    },
    {
      "name": "calendar_id",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "currency",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "timezone",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "active",
      "type": "Edm.Boolean",
      "retrievable": true,
      "facetable": true,
      "filterable": true
    },
    {
      "name": "kpi_collection",
      "type": "Edm.String",
      "searchable": false,
      "sortable": false,
      "facetable": false,
      "retrievable": false,
      "filterable": false
    },
    {
      "name": "rid",
      "type": "Edm.String"
    }
  ],
  "scoringProfiles": [
    {
      "name": "boostEntity",
      "text": {
        "weights": {
          "entity_id": 9,
          "name": 8,
          "description": 1
        }
      }
    }
  ],
  "analyzers": [
    {
      "name": "custom_analyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "token1",
      "tokenFilters": [
        "lowercase",
        "entityID_stopWords",
        "entityID_edgeNGram"
      ]
    }
  ],
  "tokenizers": [
    {
      "name": "token1",
      "@odata.type": "#Microsoft.Azure.Search.StandardTokenizerV2"
    }
  ],
  "tokenFilters": [
    {
      "name": "entityID_edgeNGram",
      "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "minGram": 1,
      "maxGram": 6
    },
    {
      "name": "entityID_stopWords",
      "@odata.type": "#Microsoft.Azure.Search.StopwordsTokenFilter",
      "stopwords": [
        "store",
        "region",
        "zone",
        "field_org",
        ":"
      ]
    }
  ]
}

If I run this query:

{
  "search": "0001",
  "filter": "entity_type eq 'store'",
  "select": "name,entity_id,entity_type,description,active,ancestors",
  "count": "true"
}

I get this result, which is correct, because it matches on the name field, which has the highest weight after entity_id.

"@odata.count": 1,
"value": [
    {
        "@search.score": 1.6654625,
        "name": "LensCrafters 0001",
        "entity_id": "store:1",
        "entity_type": "store",
        "description": "2130 Mall Road, Florence, 41042, KY, US",
        "active": true,
        "ancestors": [
            "region:1021",
            "zone:1123",
            "field_org:lenscrafters_na",
            "ROOT"
        ]
    }
]

}

But if I run this query:

{
  "search": "1",
  "filter": "entity_type eq 'store' ",
  "select":"name,entity_id,entity_type,description,active,ancestors",
  "count": "true"

}

I get results that are not correct:

 {
            "@search.score": 1.4522386,
            "name": "LensCrafters 1622",
            "entity_id": "store:1622",
            "entity_type": "store",
            "description": "31625 Pacific Hwy S, Spc #E-1, Federal Way, 98003-5645, WA, US",
            "active": true,
            "ancestors": [
                "region:1024",
                "zone:1107",
                "field_org:lenscrafters_na",
                "ROOT"
            ]
        },
        {
            "@search.score": 1.3403159,
            "name": "LensCrafters 1178",
            "entity_id": "store:1178",
            "entity_type": "store",
            "description": "1 W FlatIron Crossing Dr #1104, Broomfield, 80021-8881, CO, US",
            "active": true,
            "ancestors": [
                "region:1019",
                "zone:1122",
                "field_org:lenscrafters_na",
                "ROOT"
            ]
        },
        { 
...............

Why isn't the result the following, even though entity_id has a weight of 9 in the scoring profile?

 "@odata.count": 1,
    "value": [
        {
            "@search.score": 1.6654625,
            "name": "LensCrafters 0001",
            "entity_id": "store:1",
            "entity_type": "store",
            "description": "2130 Mall Road, Florence, 41042, KY, US",
            "active": true,
            "ancestors": [
                "region:1021",
                "zone:1123",
                "field_org:lenscrafters_na",
                "ROOT"
            ]
        }
    ]
}

Here is the scoring profile:

"scoringProfiles": [
        {
            "name": "boostEntity",
            "text": {
                "weights": {
                    "entity_id": 9,
                    "name": 8,
                    "description": 1
                }
            },
            "functions": [],
            "functionAggregation": null
        }
    ],.............

1 Answer:

Answer 0: (score: 0)

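You are using a custom analyzer on the entity_id field, which produces the following tokens for the text store:1178: 1, 11, 117, 1178 (you can test your analyzer configuration with the Analyze API). This means that the documents LensCrafters 1622 and LensCrafters 1178 match the query just as the document LensCrafters 0001 does: they all have the token 1 in entity_id. However, the documents LensCrafters 1622 and LensCrafters 1178 also match 1 in the description field. Therefore, they score higher than LensCrafters 0001.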
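The token list above can be reproduced with a small local sketch of the index-time analysis chain. This is only an approximation: the regular expression is a rough stand-in for StandardTokenizerV2, and the function name `index_tokens` is hypothetical. The stopword list and the minGram/maxGram values are copied from the index definition in the question.

```python
import re

# Values copied from the index definition in the question.
STOPWORDS = {"store", "region", "zone", "field_org", ":"}
MIN_GRAM, MAX_GRAM = 1, 6


def index_tokens(text):
    """Approximate custom_analyzer: tokenize, lowercase, drop stopwords, edge n-grams."""
    # Assumption: \w+ is only a rough stand-in for StandardTokenizerV2.
    words = [w.lower() for w in re.findall(r"\w+", text)]
    words = [w for w in words if w not in STOPWORDS]
    grams = []
    for word in words:
        # Emit every prefix of the word from MIN_GRAM up to MAX_GRAM characters.
        for n in range(MIN_GRAM, min(MAX_GRAM, len(word)) + 1):
            grams.append(word[:n])
    return grams


for entity_id in ("store:1", "store:1622", "store:1178"):
    print(entity_id, index_tokens(entity_id))
```

Under these assumptions all three entity_id values produce the token 1, which is why the query 1 matches every one of them.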

To learn more about query processing and custom analyzers in Azure Search, read: How full text search works in Azure Search
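To check the real tokens rather than reason about them, the Analyze API mentioned in this answer can be called against the index. The sketch below only builds the request and does not send it. The service name `my-service`, the key placeholder, the helper name `build_analyze_request`, and the `api-version` value are assumptions to replace with your own.

```python
import json
import urllib.request


def build_analyze_request(service, index, api_key, text, analyzer,
                          api_version="2020-06-30"):
    """Build (but do not send) a POST request to the index's analyze endpoint."""
    # Assumption: api_version is a placeholder; use one your service supports.
    url = (f"https://{service}.search.windows.net/indexes/{index}/analyze"
           f"?api-version={api_version}")
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical service name and key; index and analyzer names come from the question.
req = build_analyze_request("my-service", "testentities", "<admin-api-key>",
                            "store:1178", "custom_analyzer")
print(req.full_url)
print(req.data.decode("utf-8"))
# To send it: urllib.request.urlopen(req), then read the "tokens" array in the JSON response.
```

Running the same request with "analyzer": "standard" shows how the query text is tokenized, since the field uses a different searchAnalyzer than indexAnalyzer.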

Do you want to keep the edgeNGram token filter in your analysis chain? Why?
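To make the trade-off concrete, here is a rough sketch of which entity_id values the query term 1 would match with and without the edge n-gram filter. It is a simplified model, not Azure Search itself: the regular expression only approximates the standard tokenizer, scoring is ignored, and the names `index_terms` and `matching_ids` are hypothetical.

```python
import re

# Stopwords copied from the index definition in the question.
STOPWORDS = {"store", "region", "zone", "field_org", ":"}


def index_terms(text, use_edge_ngram, max_gram=6):
    """Return the set of indexed terms for an entity_id value."""
    # Assumption: \w+ is only a rough stand-in for the standard tokenizer.
    words = [w.lower() for w in re.findall(r"\w+", text)]
    words = [w for w in words if w not in STOPWORDS]
    if not use_edge_ngram:
        return set(words)
    # Prefixes of length 1..max_gram, as the edge n-gram filter would emit.
    return {w[:n] for w in words for n in range(1, min(max_gram, len(w)) + 1)}


def matching_ids(query_term, entity_ids, use_edge_ngram):
    """List the entity_id values whose indexed terms contain the query term."""
    return [e for e in entity_ids if query_term in index_terms(e, use_edge_ngram)]


ENTITY_IDS = ["store:1", "store:1622", "store:1178"]
print(matching_ids("1", ENTITY_IDS, use_edge_ngram=True))
print(matching_ids("1", ENTITY_IDS, use_edge_ngram=False))
```

In this model, dropping the filter leaves only store:1 as a match for 1, at the cost of losing prefix matching on entity_id.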