Oracle Text query - CONTAINS and special characters

Time: 2013-12-20 10:16:55

Tags: oracle oracle11g special-characters reserved-words

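We are using an Oracle Text query to search a table based on user input.

So if the user types "blue sky", we want to search for each word on an OR basis, and we do this: where contains(columnname, 'blue or sky', 1) > 0

To build that, we take the user input and replace each space with ' or ' before passing it to the query.

This works fine, and we can order by score descending to get the most relevant entries first.

However, we ran into a problem with 'special' characters. It started with the comma, but I then found in the documentation that there are a lot of them.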

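So we wrote some code that detects each 'special' character and prefixes it with the escape character '\'. That works too. But then there are also reserved words, such as AND. If the user enters 'jack and jill', we convert it to 'jack or and or jill', which gives a text query parser syntax error because of the word 'and'. We then tried to cater for those as well, but we had to check that each one was preceded and followed by a space, so as not to pick up words such as 'handy'. And of course the reserved word could be the first or the last word... grrr, there must be an easier way to do this.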
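For illustration, the backslash approach described above might look like the following Python sketch. The character set in `SPECIAL_CHARS` is an assumption based on a reading of the Oracle Text documentation and may be incomplete, and `escape_special` and `naive_or_query` are hypothetical helper names. The second call shows why reserved words remain a problem after escaping.

```python
# Assumed set of Oracle Text special characters; verify it against the docs.
SPECIAL_CHARS = set(",&=?{}\\()[]-;~|$!>*%_")


def escape_special(word: str) -> str:
    """Prefix every special character in `word` with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in word)


def naive_or_query(user_input: str) -> str:
    """Escape each word, then join the words with ' or '."""
    return " or ".join(escape_special(w) for w in user_input.split())


print(naive_or_query("blue, sky"))      # blue\, or sky
print(naive_or_query("jack and jill"))  # jack or and or jill
```

The comma is handled, but the reserved word 'and' passes through untouched, which is the case that triggers the parser error.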

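Then I read about the {} braces option, which escapes the whole string.

Question: can I do this even when the string contains no special characters?

Also, I can't see how this caters for the OR functionality we need for each word. If I use contains(columnname, '{jack or xxxxx}', 1) > 0, it returns nothing.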
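One possible way to get both the escaping and the per-word OR, sketched in Python under some assumptions: wrap each word in its own pair of braces and join the groups with OR, so the operator stays outside the escaped text. `build_or_query` is a hypothetical helper name, and dropping brace characters from the input is a simplification made here to avoid having to escape braces inside braces.

```python
def build_or_query(user_input: str) -> str:
    """Turn free text into '{w1} OR {w2} ...' for use in a CONTAINS query."""
    groups = []
    for word in user_input.split():
        # Assumption: simply drop braces rather than trying to escape them.
        cleaned = word.replace("{", "").replace("}", "")
        if cleaned:
            groups.append("{" + cleaned + "}")
    return " OR ".join(groups)


print(build_or_query("jack and jill"))  # {jack} OR {and} OR {jill}
print(build_or_query("blue, sky"))      # {blue,} OR {sky}
```

The resulting string would then be passed as a bind variable, for example contains(columnname, :query, 1) > 0. Braces around a word with no special characters should be harmless, though that is worth confirming against your own index. An empty result (blank input) should be handled by the caller instead of being sent to CONTAINS.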

Any advice is greatly appreciated, thanks!

1 Answer:

Answer 0 (score: 1)

Maybe I'm not answering your question, but why not do the search this way?

with input as (select 'blue red white' example from dual),
     split_rule as (select '[^ ]+' pattern from dual),
     input_array as (select /*+ materialize */ regexp_substr(example,pattern,1,level) word
                     from input, split_rule
                     connect by level <= regexp_count(example,pattern)),
     search_table as (select 'blue sky' item from dual
                      union all
                      select 'green grass' from dual
                      union all
                      select 'red apple' from dual
                      union all
                      select 'orange juice' from dual)
select item string_found,
       word hit_by
from input_array,search_table
where item like '%'||word||'%';

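Performance should be the same; the materialize hint stops Oracle from moving the connect by out into the outer query.

If you would rather split the string into words outside the query, just create an Oracle temporary table, fill it with the user's search words on each request (mimicking input_array from the query above), and use that instead.

Edit 1: Since you have given us some additional information, I'll update the answer. The top part stays the same; only the final query changes:

1) If your rating is based on the number of distinct words matched, just use this query: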

with input as (select 'blue red white' example from dual),
     split_rule as (select '[^ ]+' pattern from dual),
     input_array as (select /*+ materialize */ regexp_substr(example,pattern,1,level) word
                     from input, split_rule
                     connect by level <= regexp_count(example,pattern)),
     search_table as (select 'blue sky red' item from dual
                      union all
                      select 'green grass' from dual
                      union all
                      select 'red apple blue white' from dual
                      union all
                      select 'orange juice' from dual)
select item string_found, count(*) rate
from input_array,search_table
where item like '%'||word||'%'
group by item
order by 2 desc;

2) If your rating is based on the total number of hits:

with input as (select 'blue red white' example from dual),
     split_rule as (select '[^ ]+' pattern from dual),
     input_array as (select /*+ materialize */ regexp_substr(example,pattern,1,level) word
                     from input, split_rule
                     connect by level <= regexp_count(example,pattern)),
     search_table as (select 'blue sky red blue blue' item from dual
                      union all
                      select 'green grass' from dual
                      union all
                      select 'red apple blue white' from dual
                      union all
                      select 'orange juice' from dual)
select item string_found, sum(regexp_count(item,word)) rate
from input_array,search_table
where item like '%'||word||'%'
group by item
order by 2 desc;
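To make the difference between the two ratings concrete, here is a small Python sketch that mirrors them outside the database. It is only an illustration and not part of the SQL above: `rate_distinct` and `rate_total` are hypothetical names, and plain substring matching stands in for like and regexp_count.

```python
def rate_distinct(item: str, words: list[str]) -> int:
    """Rating 1: number of search words that occur in the item at all."""
    return sum(1 for w in words if w in item)


def rate_total(item: str, words: list[str]) -> int:
    """Rating 2: total number of non-overlapping occurrences of all words."""
    return sum(item.count(w) for w in words)


words = "blue red white".split()
for item in ["blue sky red blue blue", "red apple blue white"]:
    print(item, rate_distinct(item, words), rate_total(item, words))
# blue sky red blue blue 2 4
# red apple blue white 3 3
```

With these sample rows the two rules rank the items in opposite order: the first rule favours 'red apple blue white' (three distinct words), the second favours 'blue sky red blue blue' (four hits in total).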

Edit 2: regexp_count is not available in Oracle 10g, so to get the same word count there, replace it with a query like this:

select length(no_double_spaces) - length(replace(no_double_spaces,' ')) + 1 amount_of_words
from (select trim(regexp_replace('blue  red white','[ ]+',' ')) no_double_spaces
      from dual);

Now I'll show how to use it in the first query from Edit 1:

with input as (select 'blue red white' example from dual),
     split_rule as (select '[^ ]+' pattern from dual),
     input_array as (select /*+ materialize */ regexp_substr(example,pattern,1,level) word
                     from input, split_rule
                     connect by level <= (select length(no_double_spaces) - length(replace(no_double_spaces,' ')) + 1 amount_of_words
                                          from (select trim(regexp_replace(example,'[ ]+',' ')) no_double_spaces
                                                from input)
                                         )
                    ),
     search_table as (select 'blue sky red blue blue' item from dual
                      union all
                      select 'green grass' from dual
                      union all
                      select 'red apple blue white' from dual
                      union all
                      select 'orange juice' from dual)
select item string_found, count(*) rate
from input_array,search_table
where item like '%'||word||'%'
group by item
order by 2 desc;