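When I try to run a regexp query against an Elasticsearch instance, I get a parse exception.
The regular expression I am trying to use is
.*((<31>)).*
which I believe Lucene cannot parse.
Here is the stack trace: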
> 17:04:58.585 [elasticsearch[Spider-Girl][search][T#7]] DEBUG
> org.elasticsearch.action.search.type - [Spider-Girl] [termweb][4],
> node[3N7Y3PKuRYKqZJ8zQBpx3Q], [P], s[STARTED]: Failed to execute
> [org.elasticsearch.action.search.SearchRequest@5cb6e4d6] lastShard
> [true] org.elasticsearch.search.SearchParseException: [termweb][4]:
> from[0],size[50]: Parse Failure [Failed to parse source
> [{"from":0,"size":50,"query":{"bool":{"must":[{"term":{"language.id":41}},{"regexp":{"fields.22":{"value":".*((<31>)).*"}}}]}}}]]
> at
> org.elasticsearch.search.SearchService.parseSource(SearchService.java:634)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.search.SearchService.createContext(SearchService.java:507)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:480)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:252)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
> [elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
> [elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
> [elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
> [elasticsearch-1.1.0.jar:na] at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_05] at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_05] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_05]
> Caused by: java.lang.IllegalArgumentException: '31' not found at
> org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:555)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at
> org.apache.lucene.util.automaton.RegExp.findLeaves(RegExp.java:571)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at
> org.apache.lucene.util.automaton.RegExp.findLeaves(RegExp.java:569)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at
> org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:499)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at
> org.apache.lucene.util.automaton.RegExp.toAutomatonAllowMutate(RegExp.java:478)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at
> org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:442)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:90)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at org.apache.lucene.search.RegexpQuery.<init>(RegexpQuery.java:79)
> ~[lucene-core-4.7.0.jar:4.7.0 1570806 - simon - 2014-02-22 08:25:23]
> at
> org.elasticsearch.index.query.RegexpQueryParser.parse(RegexpQueryParser.java:123)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:223)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.index.query.BoolQueryParser.parse(BoolQueryParser.java:93)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.index.query.QueryParseContext.parseInnerQuery(QueryParseContext.java:223)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:330)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.index.query.IndexQueryParserService.parse(IndexQueryParserService.java:260)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.search.query.QueryParseElement.parse(QueryParseElement.java:33)
> ~[elasticsearch-1.1.0.jar:na] at
> org.elasticsearch.search.SearchService.parseSource(SearchService.java:622)
> ~[elasticsearch-1.1.0.jar:na] ... 11 common frames omitted
I would be glad if someone could point out the regular-expression caveats in Lucene / Elasticsearch that might apply here.
Answer 0 (score: 0)
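It turned out that the problem is that Elasticsearch cannot handle the symbols '<' and '>' as literal characters in a regexp query. Lucene's regexp syntax reserves them for numeric intervals and named automata, which is why the exception reports that '31' was not found.
This query fails: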
{
  "query": {
    "regexp": {
      "fields.23": ".*(<approved>)|(<rejected>).*"
    }
  }
}
This one works fine:
{
  "query": {
    "regexp": {
      "fields.23": ".*(approved)|(rejected).*"
    }
  }
}
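If the angle brackets need to stay in the pattern, one option that may help is to backslash-escape them, since Lucene's regexp syntax accepts a backslash before any single character. The sketch below is in Python and only builds the request body; the names `RESERVED`, `escape_literal` and `regexp_query` are hypothetical helpers, not part of any Elasticsearch client, and whether the escaped pattern then matches the stored value on Elasticsearch 1.1.0 has not been verified here.

```python
import json

# Characters that the Elasticsearch regexp documentation lists as reserved.
# "<" and ">" belong to the optional interval / named-automaton operators.
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_literal(text):
    """Backslash-escape every reserved character in a literal fragment."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


def regexp_query(field, pattern):
    """Build the body of a regexp query for a single field (hypothetical helper)."""
    return {"query": {"regexp": {field: pattern}}}


# Only the literal tags are escaped; the wildcards and grouping stay as operators.
pattern = (
    ".*(" + escape_literal("<approved>") + ")|("
    + escape_literal("<rejected>") + ").*"
)
body = regexp_query("fields.23", pattern)

# json.dumps doubles each backslash, which is what the JSON request needs.
print(json.dumps(body, indent=2))
```

Escaping only the literal fragments, rather than the whole pattern, keeps `.*`, the parentheses and `|` working as regexp operators.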
The data in ES is:
{
  "_index": "termweb",
  "_type": "term",
  "_id": "62",
  "_score": 1.0,
  "_source": {
    "id": "62",
    "name": "aardvark",
    "language": {
      "id": 41,
      "name": "English",
      "iso": "ENG"
    },
    "definition": null,
    "conceptId": 61,
    "displayId": "3d770",
    "fields": {
      "22": "10",
      "23": "<approved><rejected>",
      "25": "is a medium-sized, burrowing, nocturnal mammal native to Africa."
    },
    "parentId": "61"
  }
}