We created an index in Azure Search Service, as shown below:
"analyzers": [
{
"@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
"name": "SWMLuceneAlongWithCustomHyphenAnalyser",
"tokenizer": "keyword_v2",
"tokenFilters": [
"lowercase"
],
"charFilters": []
}
This analyzer is assigned to the property named "lowerMachineTag". Now, when we search using the query below, we get the expected results:
Query: search=lowerSystemID:/.*it\'s.*/lowerMachineTag:/.*it\'s.*/&$filter=(systemID%20ne%20null)%20and%20(ownerSalesforceRecordID%20eq%20'a0h5B000000gJKfQAM')&$count=true&$top=100&$skip=0
Result:
{
"@odata.context": "https://abcd/indexes('orders-index')/$metadata#docs",
"@odata.count": 4,
"value": [
{
"@search.score": 0.1862714,
"systemID": "*1QXEDL8E2V8MGBY",
"machineTag": "It's me",
"systemIDMachineTag": "*1QXEDL8E2V8MGBY|It's me",
"machineTagSystemID": "It's me|*1QXEDL8E2V8MGBY",
"lowerMachineTag": "it's me",
"lowerSystemID": "*1qxedl8e2v8mgby",
"ownerSalesforceRecordID": "a0h5B000000gJKfQAM",
"parentSalesforceRecordID": "a0h5B000000gJKfQAM"
},
{
"@search.score": 0.16417237,
"systemID": "*1QXEDL8E2V8MGBY",
"machineTag": "It's me",
"systemIDMachineTag": "*1QXEDL8E2V8MGBY|It's me",
"machineTagSystemID": "It's me|*1QXEDL8E2V8MGBY",
"lowerMachineTag": "it's me",
"lowerSystemID": "*1qxedl8e2v8mgby",
"ownerSalesforceRecordID": "a0h5B000000gJKfQAM",
"parentSalesforceRecordID": "a0h5B000000gJKfQAM"
},
{
"@search.score": 0.16417237,
"systemID": "*1QXEDL8E2V8MGBY",
"machineTag": "It's me",
"systemIDMachineTag": "*1QXEDL8E2V8MGBY|It's me",
"machineTagSystemID": "It's me|*1QXEDL8E2V8MGBY",
"lowerMachineTag": "it's me",
"lowerSystemID": "*1qxedl8e2v8mgby",
"ownerSalesforceRecordID": "a0h5B000000gJKfQAM",
"parentSalesforceRecordID": "a0h5B000000gJKfQAM"
},
{
"@search.score": 0.16417237,
"systemID": "*1QXEDL8E2V8MGBY",
"machineTag": "It's me",
"systemIDMachineTag": "*1QXEDL8E2V8MGBY|It's me",
"machineTagSystemID": "It's me|*1QXEDL8E2V8MGBY",
"lowerMachineTag": "it's me",
"lowerSystemID": "*1qxedl8e2v8mgby",
"ownerSalesforceRecordID": "a0h5B000000gJKfQAM",
"parentSalesforceRecordID": "a0h5B000000gJKfQAM"
}
]
}
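But what is the general recommendation for the analyzer configuration if, in addition to the behavior above, we should also get results when searching with lowerMachineTag:/.it./?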
Answer 0 (score: 2)
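It looks like you are using regular expressions in your search query. For that to work, you also have to add "&queryType=full" to the query string. Otherwise the whole search term ("lowerSystemID:/.*it\'s.*/lowerMachineTag:/.*it\'s.*/") is treated as a simple query, meaning it is analyzed with the standard analyzer and matched against any searchable field. With "&queryType=full", each regular expression is matched only against the field it names.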
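To make the "&queryType=full" point concrete, here is a minimal sketch that builds such a query string with Python's standard library. The helper name build_search_query is made up for illustration, and the service endpoint and API version are deliberately left out; only the parameters shown in the question plus queryType are used.

```python
from urllib.parse import urlencode

def build_search_query(search, filter_expr, top=100, skip=0):
    """Build a URL-encoded query string that opts into the full Lucene syntax.

    Hypothetical helper for illustration only; it does not call the service.
    """
    params = {
        "search": search,
        "queryType": "full",  # without this, the regex is parsed as a simple query
        "$filter": filter_expr,
        "$count": "true",
        "$top": str(top),
        "$skip": str(skip),
    }
    return urlencode(params)

# Search expression and filter copied from the question.
search_expr = r"lowerSystemID:/.*it\'s.*/lowerMachineTag:/.*it\'s.*/"
filter_expr = "(systemID ne null) and (ownerSalesforceRecordID eq 'a0h5B000000gJKfQAM')"

query_string = build_search_query(search_expr, filter_expr)
print(query_string)
```

The resulting string would be appended after the "?" of the index's docs endpoint, next to whatever api-version parameter your service requires.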
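As for your question: if "lowerMachineTag:/.it./" is specified, it will not match any of the four documents above, because the leading '.' in the regular expression has to match a character before "it", and in all four documents the value of "lowerMachineTag" starts with "it".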
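If you remove the leading '.' and use just "lowerMachineTag:/it./", it still will not match, because the regular expression has to match the whole token (adding a '*' would work: "lowerMachineTag:/it.*/").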
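The whole-token rule can be tried out locally. The sketch below uses Python's re.fullmatch as a stand-in for the service's matching; this is an assumption made for illustration, since Lucene's regex dialect is not identical to Python's, although these simple patterns behave the same way in both. The helper name matches_whole_token is hypothetical.

```python
import re

def matches_whole_token(pattern, token):
    """Return True only if the pattern matches the entire token."""
    return re.fullmatch(pattern, token) is not None

# keyword_v2 plus lowercase leaves the field value as a single token.
token = "it's me"

for pattern in [".it.", "it.", "it.*", ".*it's.*"]:
    print(f"/{pattern}/ ->", matches_whole_token(pattern, token))
```

Only the last two patterns cover the token from its first character to its last, which is why "/it./" needs the trailing '*'.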
Alternatively, you can change the analyzer definition to use the nGram_v2 token filter so that "/it./" works, as follows:
"analyzers": [
{
"@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
"name": "SWMLuceneAlongWithCustomHyphenAnalyser",
"tokenizer": "keyword_v2",
"tokenFilters": [
"lowercase", "myNGramTokenFilter"
],
"charFilters": []
}
],
"tokenFilters":[
{
"name":"myNGramTokenFilter",
"@odata.type":"#Microsoft.Azure.Search.NGramTokenFilterV2",
"minGram":1,
"maxGram":100
}
]
With this change, your original query (plus "queryType=full") still returns the same results, and "lowerMachineTag:/it./" returns results as well.
I hope this helps!
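As a rough illustration of why the n-gram filter changes the outcome, the sketch below emulates in plain Python what an n-gram token filter with minGram 1 and maxGram 100 would emit for the lowercased value "it's me", then checks the regular expressions against every emitted token. The functions ngrams and any_token_matches are hypothetical helpers, and re.fullmatch is only an approximation of the service's regex matching.

```python
import re

def ngrams(token, min_gram=1, max_gram=100):
    """Emit every substring of token whose length is within [min_gram, max_gram]."""
    grams = set()
    longest = min(max_gram, len(token))
    for size in range(min_gram, longest + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams

def any_token_matches(pattern, tokens):
    """A regex query hits the document if any indexed token matches in full."""
    return any(re.fullmatch(pattern, t) for t in tokens)

indexed_tokens = ngrams("it's me")

for pattern in ["it.", ".*it's.*", ".it."]:
    print(f"/{pattern}/ ->", any_token_matches(pattern, indexed_tokens))
```

The three-character gram "it'" is what lets "/it./" match, and the full value "it's me" is still emitted as the longest gram, so the original pattern keeps matching. "/.it./" still finds nothing, because no gram has a character in front of "it".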