cts:search returns unexpected results with the "unfiltered" option when wildcards are enabled

Time: 2017-02-09 15:48:47

Tags: indexing marklogic marklogic-7

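I am running cts:search with the "unfiltered" option and with wildcard search enabled (that is, passing the "wildcarded" option).

I have inserted the five XML documents pasted below into my database.

In the cts:search query below, whenever the value for the journalTitle element contains a wildcard (*), all five documents are returned.

For example: "d*", "di*", "dixi*"

Even when I pass "mohi*t" as the value for the journalTitle element, I still get all five documents in the result.

With the "filtered" option it works fine.

Why does this happen, and how can I get correct results with the "unfiltered" option?

I have searched Google extensively but could not find a solution.

Please find the cts:search query and the XML files below.

cts:search query: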

cts:search(fn:collection(), cts:element-query(
        xs:QName("root"), 
        cts:and-query(
          (
            cts:element-value-query(xs:QName("sourceType"), "JA", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), 
            cts:element-value-query(xs:QName("journalTitle"), "mohi*t", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), 
            cts:element-value-query(xs:QName("title"), "title1", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), 
            cts:element-value-query(xs:QName("volume"), "volume0", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1)
          ), 
          ()
        ), 
        ()
       ),"unfiltered")

XML content (all five XML documents pasted):

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>Dinesh</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>Dixit</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>Prashant</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>GAYARI</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>
-
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <journalTitle>KEVAL</journalTitle>
    <sourceType>JA</sourceType>
    <title>title1</title>
    <volume>volume0</volume>
</root>

You may need the xdmp:plan result, so I am pasting it below.

xdmp:plan result:

<qry:query-plan xmlns:qry="http://marklogic.com/cts/query">
    <qry:info-trace>xdmp:eval("xdmp:plan(cts:search(fn:collection(), cts:element-query(&amp;#10;   ...", (), &lt;options xmlns="xdmp:eval"&gt;&lt;database&gt;12874763000056740838&lt;/database&gt;&lt;root&gt;C:\RSuite\modules...&lt;/options&gt;)</qry:info-trace>
    <qry:info-trace>Analyzing path for search: fn:collection()</qry:info-trace>
    <qry:info-trace>Step 1 is searchable: fn:collection()</qry:info-trace>
    <qry:info-trace>Path is fully searchable.</qry:info-trace>
    <qry:info-trace>Gathering constraints.</qry:info-trace>
    <qry:info-trace>Search query contributed 1 constraint: cts:element-query(fn:QName("", "root"), cts:and-query((cts:element-value-query(fn:QName("", "sourceType"), "JA", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), cts:element-value-query(fn:QName("", "journalTitle"), "mohi*t", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), cts:element-value-query(fn:QName("", "title"), "title1", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1), cts:element-value-query(fn:QName("", "volume"), "volume0", ("case-insensitive","diacritic-sensitive","punctuation-sensitive","whitespace-sensitive","wildcarded","lang=en"), 1)), ()), ())</qry:info-trace>
    <qry:partial-plan>
        <qry:or-two-queries>
            <qry:element-query>
                <qry:key>10866465315185201428</qry:key>
                <qry:annotation>element(root)</qry:annotation>
                <qry:and-query>
                    <qry:term-query weight="1">
                        <qry:key>15329831187071590131</qry:key>
                        <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="0">
                        <qry:key>3029765743981997321</qry:key>
                        <qry:annotation>element(journalTitle)</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>4206353216190327061</qry:key>
                        <qry:annotation>element(title,value("title1"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>7729558342335907080</qry:key>
                        <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                    </qry:term-query>
                </qry:and-query>
            </qry:element-query>
            <qry:and-two-queries>
                <qry:term-query weight="0">
                    <qry:key>837267169796541076</qry:key>
                    <qry:annotation>link-child(descendant(element(root)))</qry:annotation>
                </qry:term-query>
                <qry:and-query>
                    <qry:term-query weight="1">
                        <qry:key>15329831187071590131</qry:key>
                        <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="0">
                        <qry:key>3029765743981997321</qry:key>
                        <qry:annotation>element(journalTitle)</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>4206353216190327061</qry:key>
                        <qry:annotation>element(title,value("title1"))</qry:annotation>
                    </qry:term-query>
                    <qry:term-query weight="1">
                        <qry:key>7729558342335907080</qry:key>
                        <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                    </qry:term-query>
                </qry:and-query>
            </qry:and-two-queries>
        </qry:or-two-queries>
    </qry:partial-plan>
    <qry:info-trace>Executing search.</qry:info-trace>
    <qry:final-plan>
        <qry:and-query>
            <qry:or-two-queries>
                <qry:element-query>
                    <qry:key>10866465315185201428</qry:key>
                    <qry:annotation>element(root)</qry:annotation>
                    <qry:and-query>
                        <qry:term-query weight="1">
                            <qry:key>15329831187071590131</qry:key>
                            <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="0">
                            <qry:key>3029765743981997321</qry:key>
                            <qry:annotation>element(journalTitle)</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>4206353216190327061</qry:key>
                            <qry:annotation>element(title,value("title1"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>7729558342335907080</qry:key>
                            <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                        </qry:term-query>
                    </qry:and-query>
                </qry:element-query>
                <qry:and-two-queries>
                    <qry:term-query weight="0">
                        <qry:key>837267169796541076</qry:key>
                        <qry:annotation>link-child(descendant(element(root)))</qry:annotation>
                    </qry:term-query>
                    <qry:and-query>
                        <qry:term-query weight="1">
                            <qry:key>15329831187071590131</qry:key>
                            <qry:annotation>element(sourceType,value("JA"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="0">
                            <qry:key>3029765743981997321</qry:key>
                            <qry:annotation>element(journalTitle)</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>4206353216190327061</qry:key>
                            <qry:annotation>element(title,value("title1"))</qry:annotation>
                        </qry:term-query>
                        <qry:term-query weight="1">
                            <qry:key>7729558342335907080</qry:key>
                            <qry:annotation>element(volume,value("volume0"))</qry:annotation>
                        </qry:term-query>
                    </qry:and-query>
                </qry:and-two-queries>
            </qry:or-two-queries>
        </qry:and-query>
    </qry:final-plan>
    <qry:info-trace>Selected 5 fragments</qry:info-trace>
    <qry:result estimate="5"/>
</qry:query-plan>

Apologies for any syntax mistakes.

Please let me know if you need more information.

2 Answers:

Answer 0 (score: 3)

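Wildcard searches rely on either appropriate indexes or filtering. Have you checked that fast element trailing wildcard searches is enabled in your database, and possibly trailing wildcard searches as well? That works for patterns with at least four leading characters. For three leading characters you also need to enable fast element character searches, and possibly three character searches as well.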
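As a minimal sketch, the settings named above could also be switched on through the Admin API rather than the Admin Interface. The database name "Documents" is a placeholder and not taken from the question; substitute your own. Existing content generally has to be reindexed before unfiltered results reflect the new indexes.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "Documents" is a placeholder database name (an assumption) :)
let $db := xdmp:database("Documents")
let $config := admin:get-configuration()
(: indexes for patterns with a trailing wildcard :)
let $config := admin:database-set-trailing-wildcard-searches($config, $db, fn:true())
let $config := admin:database-set-fast-element-trailing-wildcard-searches($config, $db, fn:true())
(: indexes for patterns with three leading characters :)
let $config := admin:database-set-three-character-searches($config, $db, fn:true())
let $config := admin:database-set-fast-element-character-searches($config, $db, fn:true())
return admin:save-configuration($config)
```

After reindexing finishes, rerunning xdmp:plan on the original query is one way to check whether the journalTitle term is still reduced to a bare element(journalTitle) lookup.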

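MarkLogic also allows accurate unfiltered wildcard searches for patterns that start with only two characters or one character. One way is to enable the two character searches and one character searches options, but according to the documentation these are not needed if you combine three character searches with a word lexicon: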

  

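two character searches specifies whether indexes should be created to enable wildcard searches where the search pattern contains two consecutive non-wildcard characters (for example, ab*). This index is not needed if you have three character searches and a word lexicon.

one character searches specifies whether indexes should be created to enable wildcard searches where the search pattern contains a single non-wildcard character (for example, a*). This index is not needed if you have three character searches and a word lexicon.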

(Source: Admin Interface help tab)

Thanks to Dave for pointing to Understanding the Wildcard Indexes, where everything is explained in more detail.

HTH!

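Answer 1 (score: 3)

It is best to enable three character wildcard searches together with a word lexicon that uses the codepoint collation. The one and two character indexes are very expensive.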
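A rough sketch of that setup through the Admin API, assuming a database named "Documents" (a placeholder, not from the question) that does not already have a codepoint word lexicon configured:

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "Documents" is a placeholder database name (an assumption) :)
let $db := xdmp:database("Documents")
let $config := admin:get-configuration()
(: three character searches instead of the one and two character indexes :)
let $config := admin:database-set-three-character-searches($config, $db, fn:true())
(: word lexicon in the codepoint collation :)
let $lexicon := admin:database-word-lexicon("http://marklogic.com/collation/codepoint")
let $config := admin:database-add-word-lexicon($config, $db, $lexicon)
return admin:save-configuration($config)
```

The one character searches and two character searches options are left untouched here, in line with the advice above.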