Is there a list of potential syntax errors in Solr query syntax?

Time: 2012-08-31 16:05:56

Tags: solr syntax-error

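I am trying to find out whether there is a complete list of Solr syntax errors. My goal is to write a function that "cleans" front-end user queries so that they cannot cause syntax errors.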

So far I have found two errors:

EOF

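If the query ends with an uppercase AND, OR, NOT, etc., an EOF error is thrown. Fix: lowercase the query (since queries are set up to be case-insensitive).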

Unrecognized field

This happens if the query contains a colon, e.g. "Long Academic Title: Witty Subtitle Here". Fix: replace all instances of : with a space.
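The two fixes above could be sketched in Python roughly as follows. The function name `clean_query` is hypothetical, and lowercasing only helps if the query parser recognizes uppercase operators only, so treat this as an illustration under that assumption rather than a complete sanitizer.

```python
import re


def clean_query(raw: str) -> str:
    """Apply the two fixes described above to a user-supplied query.

    `clean_query` is a hypothetical helper name, not part of any Solr client.
    """
    # Replace every colon with a space so that "Title: Subtitle" is not
    # parsed as a field:value pair.
    query = raw.replace(":", " ")
    # Lowercase the whole query so a trailing AND / OR / NOT is no longer
    # an uppercase operator (assumes only uppercase operators are recognized).
    query = query.lower()
    # Collapse the runs of whitespace left behind by the replacement.
    return re.sub(r"\s+", " ", query).strip()
```

For example, `clean_query("cats AND")` yields `cats and`, which no longer ends in a dangling uppercase operator.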

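I hope these are all the problems I need to fix, but if there are any other Solr syntax errors I should be aware of and guard against, it would be very helpful to know!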

1 answer:

Answer 0: (score: 0)

I am not sure whether any complete list of syntax errors exists, but here are some that we take care of:

1) encoding issues: special characters like %, & etc. should not be
passed as is, since they may ruin the whole query

2) two asterisks together: ** may cause infinite loops or bring the
system to its knees if leading and trailing wildcards are accepted.
A search term that is just a single asterisk isn't allowed in our
system either

3) (optionally) for boolean queries ensure that opening and closing
brackets match

4) strip the punctuation, but do it with care, e.g. if U.S. turns
into US, then to ensure findability (recall matters to us), we make
sure the same happens during tokenization. We also identify URLs and
don't remove punctuation from them

5) some errors may relate to malformed proximity operators (like
near, ~), e.g. we don't allow them to be nested, and we don't allow
boolean operators inside them
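A few of the checks above (items 1 to 3) could be sketched in Python as follows. All function names here are hypothetical, and the policy of collapsing `**` and rejecting a lone `*` is just one way to read the rules above, not a description of our actual code.

```python
import re
from urllib.parse import quote


def has_balanced_parens(query: str) -> bool:
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


def check_query(query: str) -> str:
    """Normalize wildcards and reject queries that break the rules above."""
    query = query.strip()
    # Collapse runs of asterisks ("**", "***", ...) into a single one.
    query = re.sub(r"\*{2,}", "*", query)
    if query == "*":
        raise ValueError("a bare wildcard is not allowed")
    if not has_balanced_parens(query):
        raise ValueError("unbalanced parentheses")
    return query


def encode_for_url(query: str) -> str:
    """Percent-encode the query so %, & and similar survive the HTTP request."""
    return quote(query, safe="")
```

Raising an exception for a bad query, rather than silently rewriting it, makes it possible to show the user why the query was rejected.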

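I would also add that some syntax errors can be controlled through the syntax you define for your users yourself, that is, by not allowing what you don't want them to use. This also establishes a kind of search contract between your users and your application. It is also good to provide some tooltip-like information that tells users what typical syntax they can use.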