Suppose I have a query string like this:
#some terms! "phrase query" in:"my container" in:group_3
or
#some terms!
or
in:"my container" in:group_3 terms! "phrase query"
or
in:"my container" test in:group_3 terms!
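What is the best way to parse this correctly?
I have looked at Lucene's SimpleQueryParser, but it seems rather complicated for my use case. I have been trying to parse the query with a regex, but so far without real success, mainly because spaces can appear inside quotes.
Any simple ideas?
I only need a list of elements as output; I can easily handle the rest from there: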
[
"#some",
"terms!",
"phrase query",
"in:\"my container\"",
"in:group_3"
]
Answer 0 (score: 2)
Answer 1 (score: 0)
For anyone interested, here is the final Scala/Java parser I used to solve the problem, inspired by the answers to this question:
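// QueryTerm is not defined in the original snippet; this shape is assumed from how it is used below.
case class QueryTerm(prefix: Option[String], text: String)

def testMatcher(query: String): Unit = {
  // Optional "prefix:" in front of a term, captured under the given group name.
  def optionalPrefix(groupName: String) = s"(?:(?:(?<$groupName>[a-zA-Z]+)[:])?)"
  // Quoted text may contain spaces; unquoted text stops at whitespace or a quote.
  val quoted = optionalPrefix("prefixQuoted") + "\"(?<textQuoted>[^\"]*)\""
  val unquoted = optionalPrefix("prefixUnquoted") + "(?<textUnquoted>[^\\s\"]+)"
  val regex = quoted + "|" + unquoted
  val matcher = regex.r.pattern.matcher(query)
  var results: List[QueryTerm] = Nil
  while (matcher.find()) {
    val quotedResult = Option(matcher.group("textQuoted")).map(textQuoted =>
      (Option(matcher.group("prefixQuoted")), textQuoted)
    )
    val unquotedResult = Option(matcher.group("textUnquoted")).map(textUnquoted =>
      (Option(matcher.group("prefixUnquoted")), textUnquoted)
    )
    // Exactly one alternative matched, so one of the two options is defined.
    val anyResult = quotedResult.orElse(unquotedResult).get
    results = QueryTerm(anyResult._1, anyResult._2) :: results
  }
  // Terms were prepended, so reverse to print them in query order.
  println(s"results=${results.reverse.mkString("\n")}")
}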
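
If all you need is the flat list of strings from the question rather than prefix/text pairs, a smaller variant may be enough. This is only a minimal sketch, not part of the parser above: the names QueryTokenizer and tokenize are made up for illustration, and it assumes quotes are never escaped inside a phrase. It keeps the quotes on prefixed phrases and drops them on bare phrases, which is how the expected output in the question looks.

```scala
import scala.util.matching.Regex

// Hypothetical helper for illustration; the names are not from the original answer.
object QueryTokenizer {
  // Alternatives are tried left to right at each position:
  //   1. prefix:"quoted text"  -> groups 1 and 2
  //   2. "quoted text"         -> group 3
  //   3. any run without whitespace or quotes -> group 4
  private val Token: Regex =
    """([a-zA-Z]+):"([^"]*)"|"([^"]*)"|([^\s"]+)""".r

  def tokenize(query: String): List[String] =
    Token.findAllMatchIn(query).map { m =>
      if (m.group(1) != null) m.group(1) + ":\"" + m.group(2) + "\"" // keep quotes after a prefix
      else if (m.group(3) != null) m.group(3)                          // bare phrase: drop the quotes
      else m.group(4)                                                  // plain term such as #some or in:group_3
    }.toList
}
```

Calling QueryTokenizer.tokenize on the first example query should give the five elements listed in the question, in query order. An unclosed quote is not handled specially here; the quote character is simply skipped and the remaining words come out as plain terms.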