I have uploaded a text file to a MarkLogic server, in the collection "calling-returning". The text document is shown below:
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,884 com.innodata.bsi.interceptors.MethodLoggingAspect logTimeMethod - Thread Id-25 : -7703835814759006134 - Returning from WorkflowContentDao.deleteCompletedOrFailedContentList(..) Execution time: 16 ms
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,900 com.innodata.bsi.interceptors.MethodLoggingAspect logTimeMethod - Thread Id-25 : -2561765194895194936 - Calling WorkflowContentDao.getWaitingForContentListToProcess(..) with parameters FTP
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,900 com.innodata.bsi.interceptors.MethodLoggingAspect logTimeMethod - Thread Id-25 : -2561765194895194936 - Returning from WorkflowContentDao.getWaitingForContentListToProcess(..) Execution time: 0 ms
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,915 com.innodata.bsi.interceptors.MethodLoggingAspect logTimeMethod - Thread Id-25 : -2041334620910360341 - Calling WorkflowContentDao.getFTPWaitProcessType(..) with parameters ftp://10.103.100.43:21/VARIANTGENERATION/INPUT/30357186.pdf
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,915 com.innodata.bsi.interceptors.MethodLoggingAspect logTimeMethod - Thread Id-25 : -2041334620910360341 - Returning from WorkflowContentDao.getFTPWaitProcessType(..) Execution time: 0 ms
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,915 com.innodata.bsi.consumer.WorkflowContentConsumer processWorkflowContent - processWorkflowContent workflow content task: DPC-CENELEC-PUBLISH 01-7915592210 VARIANT_GENERATION
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,915 com.innodata.bsi.schedule.task.ProcessWorkflowContent failWorkflowContentTask - Failing workflow content task using scheduler because its exceeded 30 min since created DPC-CENELEC-PUBLISH 01-7915592210 VARIANT_GENERATION
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,931 com.innodata.bsi.interceptors.MethodLoggingAspect logTimeMethod - Thread Id-25 : 8235148762900748472 - Calling WorkflowContentDao.setPickedBy(..) with parameters com.innodata.bsi.domain.WorkflowContentInfo@5f7839bd
[INFO] [workflowContentListner-1] 2019-01-03 00:00:59,931 com.innodata.bsi.interceptors.MethodLoggingAspect logTimeMethod - Thread Id-25 : 8235148762900748472 - Returning from WorkflowContentDao.setPickedBy(..) Execution time: 0 ms
I am searching this document for strings like "2561765194895194936 - Calling", where the number can be anything. So I wrote the following query:
let $search := cts:search(collection("calling-returning"), cts:word-query(" - Calling"))
return $search
But it returns the complete document. I only want results of the following form:
2561765194895194936 - Calling
256176519489514568 - Calling
568651948951566 - Calling
Answer 0 (score: 1)
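The unit of search and retrieval in MarkLogic is the document. If you want to search lines individually, they have to be separate documents. Once you have the matching document, if you want to pull the matching lines out of it, you need to tokenize the document into lines and then run the match against each line, e.g. tokenize($doc,"\n")[cts:contains(text {.}, $query)]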
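A minimal sketch of that tokenize-and-filter step, applied to the question's data. It assumes the log was loaded as a plain text document in the "calling-returning" collection, that a word query for "Calling" is an acceptable stand-in for the original " - Calling" query, and that a regular expression is an acceptable way to trim each matching line down to the "number - Calling" fragment (the regular expression is an addition for illustration, not part of the answer above):

```xquery
xquery version "1.0-ml";

(: Assumption: the log lives in this collection as a text document. :)
let $query := cts:word-query("Calling")
(: The search returns whole documents that contain the word. :)
for $doc in cts:search(fn:collection("calling-returning"), $query)
(: Split the text into lines; "\r?\n" also tolerates Windows line endings. :)
for $line in fn:tokenize(fn:string($doc), "\r?\n")
where cts:contains(text { $line }, $query)
  and fn:matches($line, "\d+ - Calling")
(: Keep only the digits followed by " - Calling"; a leading minus sign is dropped. :)
return fn:replace($line, "^.*?(\d+ - Calling).*$", "$1")
```

The fn:matches guard is there because fn:replace returns its input unchanged when the pattern does not match, which would otherwise let whole lines through.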
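That won't be very efficient. You would be better off pre-processing the text document to add some markup (e.g. a root element and a line element for each line). Then at least you don't have to tokenize the whole thing at query time, although after the document matches you still need to run the match against every line: $doc//line[cts:contains(., $query)]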
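One way the pre-processing could look, as a sketch only. The two URIs, the log and line element names, and the "calling-returning-lines" collection are hypothetical names chosen for illustration; the source text document is assumed to be readable with fn:doc:

```xquery
xquery version "1.0-ml";

(: Hypothetical URIs, for illustration only. :)
let $source-uri := "/logs/calling-returning.txt"
let $target-uri := "/logs/calling-returning.xml"
let $text := fn:string(fn:doc($source-uri))
return
  xdmp:document-insert(
    $target-uri,
    (: One root element, one line element per non-empty input line. :)
    <log>{
      for $line in fn:tokenize($text, "\r?\n")[. ne ""]
      return <line>{ $line }</line>
    }</log>,
    xdmp:default-permissions(),
    "calling-returning-lines"  (: hypothetical collection name :)
  )
```

The inserted document only becomes visible after this request commits, so the per-line filter from the answer, for example fn:doc("/logs/calling-returning.xml")//line[cts:contains(., $query)], would run in a separate request.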