Lucene QueryParser interprets 'AND OR' as a command?

Asked: 2010-08-10 18:28:37

Tags: lucene pylucene

I am calling Lucene (PyLucene, to be precise) with the following code:

analyzer = StandardAnalyzer(Version.LUCENE_30)
queryparser = QueryParser(Version.LUCENE_30, "text", analyzer)
query = queryparser.parse(queryparser.escape(querytext))

But consider what happens when querytext contains the following:

querytext = "THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT"

In this case, the "AND OR" trips up the query parser even though I am using queryparser.escape. How can I avoid the following error message?

    Java stacktrace:
org.apache.lucene.queryParser.ParseException: Cannot parse 'THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY WOULD NOT GO BACK AND OR RECOMMEND IT': Encountered " <OR> "OR "" at line 1, column 80.
Was expecting one of:
    <NOT> ...
    "+" ...
    "-" ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    "[" ...
    "{" ...
    <NUMBER> ...
    <TERM> ...
    "*" ...

 at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:187)
     ....
 at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1759)
 at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1641)
 at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1268)
 at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1207)
 at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1167)
 at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:182)

3 Answers:

Answer 0 (score: 1)

The problem is not OR on its own, it is the sequence AND OR.

I use the following workaround:

query = queryparser.parse(queryparser.escape(querytext.replace("AND OR", "AND or")))
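
The replace above only covers the exact string "AND OR". A slightly more general variant could lowercase every standalone uppercase operator word before the text reaches the parser. This is only a sketch: the helper name lowercase_operators is made up for illustration, and treating NOT the same way as AND and OR is an assumption about what you want for free-text input.

```python
import re

# Words the classic QueryParser treats as boolean operators when written in
# upper case. Including NOT here is an assumption, not part of the original
# workaround.
_OPERATOR_WORDS = re.compile(r"\b(AND|OR|NOT)\b")

def lowercase_operators(text):
    """Lowercase standalone AND/OR/NOT so they read as ordinary terms.

    Hypothetical helper; every other word keeps its original case.
    """
    return _OPERATOR_WORDS.sub(lambda m: m.group(1).lower(), text)

# Intended use with the parser from the question (not executed here):
# query = queryparser.parse(queryparser.escape(lowercase_operators(querytext)))
```

Words that merely contain an operator, such as ANDROID, are left alone because the pattern only matches at word boundaries.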

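Answer 1 (score: 1)

queryparser.escape only escapes special characters (as shown on this page) and leaves "AND OR" untouched, so it does not help in your case. Since you presumably also used StandardAnalyzer to analyze the text at index time, the terms in the index are already lowercase. You can therefore convert the whole query string to lowercase before handing it to the queryparser. Lowercase "and" and "or" are not treated as operators, so "and or" will not trip up the query parser.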
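
A minimal sketch of that suggestion, assuming the index really was built with a lowercasing analyzer such as StandardAnalyzer. The helper name normalize_query_text is hypothetical, and the parser call is shown only as a comment because it needs the PyLucene objects from the question.

```python
def normalize_query_text(querytext):
    """Lowercase the whole query string so that AND, OR and NOT lose
    their operator meaning (hypothetical helper)."""
    return querytext.lower()

querytext = ("THE FOOD WAS HONESTLY NOT WORTH THE PRICE. MUCH TOO PRICY "
             "WOULD NOT GO BACK AND OR RECOMMEND IT")
normalized = normalize_query_text(querytext)

# Intended use with the parser from the question (not executed here):
# query = queryparser.parse(queryparser.escape(normalized))
```

Lowercasing before escaping is safe here because escape only adds backslashes in front of special characters and does not depend on letter case.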

Answer 2 (score: 0)

I realize I am rather late to the party here, but putting quotes around the search string is a better option:

querytext = "\"THE FOOD WAS ... \""