Tips for improving search performance when retrieving search results?

Date: 2015-04-17 12:55:17

Tags: performance search resultset marklogic

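I am using search:search to retrieve fairly large result sets (0.5K to 3K items) to plot some charts, and I use the facet goodies from the Search API to let end users do analysis: they filter the data set, and the charts are then rebuilt dynamically. Retrieving the data from the database and getting it to the client web app takes about 5-10 seconds (less when the data set is smaller), which is not very pleasant for end users. I know this is a conceptual issue, a delicate trade-off between letting users easily shape the plotted data and the speed of that process, but any help or tips would be much appreciated. Thanks!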

P.S.: I tried using search:parse and then passing the query to cts:query, which produces an error like this: [1.0-ml] XDMP-NONMIXEDCOMPLEXCONT: fn:data(<cts:and-query qtextjoin="AND" strength="20" qtextgroup="( )" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:cts="http://marklogic.com/cts"><cts:element-range-query qtextpre="Jaar:" qtextref="cts:annotati...</cts:and-query>) -- Node has complex type with non-mixed complex content

Options:

<search:options xmlns:search="http://marklogic.com/appservices/search">
  <search:search-option>filtered</search:search-option>
  <search:page-length>3050</search:page-length>
  <search:term apply="term">
    <search:empty apply="all-results"/>
    <search:term-option>punctuation-insensitive</search:term-option>
    <search:term-option>unstemmed</search:term-option>
  </search:term>
  <search:grammar>
    <search:quotation>"</search:quotation>
    <search:implicit>
      <cts:and-query strength="20" xmlns:cts="http://marklogic.com/cts"/>
    </search:implicit>
    <search:starter strength="30" apply="grouping" delimiter=")">(</search:starter>
    <search:starter strength="40" apply="prefix" element="cts:not-query">-</search:starter>
    <search:joiner strength="10" apply="infix" element="cts:or-query" tokenize="word">OR</search:joiner>
    <search:joiner strength="20" apply="infix" element="cts:and-query" tokenize="word">AND</search:joiner>
    <search:joiner strength="30" apply="infix" element="cts:near-query" tokenize="word">NEAR</search:joiner>
    <search:joiner strength="30" apply="near2" consume="2" element="cts:near-query">NEAR/</search:joiner>
    <search:joiner strength="50" apply="constraint">:</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="LT" tokenize="word">LT</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="LE" tokenize="word">LE</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="GT" tokenize="word">GT</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="GE" tokenize="word">GE</search:joiner>
    <search:joiner strength="50" apply="constraint" compare="NE" tokenize="word">NE</search:joiner>
  </search:grammar>
  <search:additional-query>
    <cts:not-query xmlns:cts="http://marklogic.com/cts">
      <cts:or-query>
        <cts:collection-query>
          <cts:uri>All_Intakes</cts:uri>
          <cts:uri>Reports</cts:uri>
        </cts:collection-query>
        <cts:element-query>
          <cts:element xmlns:sem="http://marklogic.com/semantics">sem:triples</cts:element>
          <cts:or-query/>
        </cts:element-query>
      </cts:or-query>
    </cts:not-query>
  </search:additional-query>
  <search:debug>false</search:debug>
  <search:extract-metadata>
    <search:qname elem-name="USER_EI"/>
    <search:qname elem-name="Customer"/>
    <search:qname elem-name="TOTALAMOUNTTENANTEI"/>
    <search:qname elem-name="TOTALAMOUNTTENANTEI2"/>
    <search:constraint-value ref="Medewerker"/>
    <search:constraint-value ref="Klant"/>
    <search:constraint-value ref="TOTALAMOUNTTENANTEI"/>
    <search:constraint-value ref="TOTALAMOUNTTENANTEI2"/>
  </search:extract-metadata>

  <search:transform-results apply="snippet"/>
  <search:constraint name="Klant">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Medewerker">
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="USER_EI"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Jaar">
    <search:range type="xs:int">
      <search:element name="Operation_Year"/>
    </search:range>
  </search:constraint>
  <search:constraint name="Kwartaal">
    <search:range type="xs:int">
      <search:element name="Operation_Quarter"/>
    </search:range>
  </search:constraint>

  <search:return-metrics>true</search:return-metrics>
  <search:return-qtext>true</search:return-qtext>
  <search:return-query>false</search:return-query>
  <search:return-results>true</search:return-results>
  <search:return-similar>false</search:return-similar>
  <search:sort-order direction="descending">
    <search:score/>
    <search:annotation>Relevancy (Desc)</search:annotation>
  </search:sort-order>
</search:options>;

2 answers:

Answer 0 (score: 2)

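It sounds like you are trying to fetch the full search result in a single page (start=1, page-length=99999999). You may want to change strategy and fetch the results in smaller batches (page-length=100?), simply iterating through the pages and dynamically appending more and more nodes to the chart. A good charting library should support that.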

HTH!

Answer 1 (score: 2)

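You mention wanting to draw charts and build facets. Both are based on values rather than on full documents. It sounds like you could return 10 documents per page and still get a much larger list of values from your facets. It looks like you are getting the values from extract-metadata. That is a good way to get some information for displaying an individual search result, but not a good way to get summary information about the data set.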

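Your constraints all use range indexes, which is good: they will be resolved quickly. For the charts, you may want to use /v1/values to get co-occurring values.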
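As a rough illustration of the co-occurrence idea, a tuples definition in the query options might look like the fragment below. This is only a sketch: the name customer-by-year is made up for the example, and it assumes the element range indexes behind the Klant and Jaar constraints (Customer and Operation_Year) can be reused as they are.

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- "customer-by-year" is a hypothetical name chosen for this example -->
  <search:tuples name="customer-by-year">
    <!-- Same range index definitions as the Klant and Jaar constraints -->
    <search:range type="xs:string" collation="http://marklogic.com/collation/">
      <search:element name="Customer"/>
    </search:range>
    <search:range type="xs:int">
      <search:element name="Operation_Year"/>
    </search:range>
  </search:tuples>
</search:options>
```

With options like these installed, a request to /v1/values/customer-by-year should return the Customer/Operation_Year pairs with their frequencies, taken from the indexes rather than from the documents.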

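I would change your page length to 10 and add <facet-option>limit=100</facet-option> to the facets, then take the chart values from the facets instead of from extract-metadata. Also, if you can arrange your indexes so that unfiltered searches give accurate results, your searches will run faster.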
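Applied to the options from the question, that suggestion might look roughly like the fragment below. It is a trimmed sketch, not a drop-in replacement: only two of the four constraints are shown, the grammar and additional-query sections are left out, and the unfiltered setting assumes the indexes alone already give accurate results for these queries.

```xml
<search:options xmlns:search="http://marklogic.com/appservices/search">
  <!-- Assumption: index resolution is accurate enough to skip filtering -->
  <search:search-option>unfiltered</search:search-option>
  <!-- Return only a handful of documents; the charts read the facets -->
  <search:page-length>10</search:page-length>
  <search:return-facets>true</search:return-facets>
  <search:constraint name="Klant">
    <search:range type="xs:string" collation="http://marklogic.com/collation/" facet="true">
      <search:element name="Customer"/>
      <search:facet-option>limit=100</search:facet-option>
    </search:range>
  </search:constraint>
  <search:constraint name="Jaar">
    <search:range type="xs:int" facet="true">
      <search:element name="Operation_Year"/>
      <search:facet-option>limit=100</search:facet-option>
    </search:range>
  </search:constraint>
</search:options>
```

The extract-metadata section is dropped here on purpose, since the facet values come from the range indexes and do not require opening each matching document.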