我们目前正在运行ColdFusion 8,但计划很快转向ColdFusion 10。这一举措最大的问题之一是我们运行的最重要的应用程序之一包括目前使用Verity Collections构建的全文文档搜索。它基本上允许用户搜索数百个PDF文档的文本内容。
我刚刚在我的开发ColdFusion 9实例中创建了一个新的Solr Collection,并尝试使用现有的索引逻辑更新集合,该逻辑每天运行以使用存储在本地服务器上的PDF文档F:\PDFS\[documentId].PDF
来更新集合:
<cfsetting requesttimeout="3600" />
<cfquery name="getDocs" datasource="myDB">
SELECT DISTINCT
itemNo,
edition,
description,
status,
'F:\PDFs\'
CONCAT documentId
CONCAT '.PDF' AS document_file
FROM SKU_ATTRIBUTES
</cfquery>
<cfindex
query="getDocs"
collection="mysolrcollection"
action="refresh"
type="file"
key="document_file"
title="description"
custom1="itemNo"
custom2="status"
custom3="edition" />
它跑了大约10分钟,然后遭到以下例外的轰炸:
Java_heap_space__javalangOutOfMemoryError_Java_heap_space___at_orgapacheluceneutilUnicodeUtilUTF16toUTF8UnicodeUtiljava236___at_orgapachelucenestoreIndexOutputwriteStringIndexOutputjava102___at_orgapacheluceneindexFieldsWriterwriteFieldFieldsWriterjava232___at_orgapacheluceneindexStoredFieldsWriterPerFieldprocessFieldsStoredFieldsWriterPerFieldjava56___at_orgapacheluceneindexDocFieldConsumersPerFieldprocessFieldsDocFieldConsumersPerFieldjava37___at_orgapacheluceneindexDocFieldProcessorPerThreadprocessDocumentDocFieldProcessorPerThreadjava234___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava762___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava745___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2215___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2187___at_orgapachesolrupdateDirectUpdateHandler2addDocDirectUpdateHandler2java238___at_orgapachesolrupdateprocessorRunUpdateProcessorprocessAddRunUpdateProcessorFactoryjava60___at_orgapachesolrhandlerXMLLoaderprocessUpdateXMLLoaderjava140___at_orgapachesolrhandlerXMLLoaderloadXMLLoaderjava69___at_orgapachesolrhandlerContentStreamHandlerBasehandleRequestBodyContentStreamHandlerBasejava54___at_orgapachesolrhandlerRequestHandlerBasehandleRequestRequestHandlerBasejava131___at_orgapachesolrcoreSolrCoreexecuteSolrCorejava1333___at_orgapachesolrservletSolrDispatchFilterexecuteSolrDispatchFilterjava303___at_orgapachesolrservletSolrDispatchFilterdoFilterSolrDispatchFilterjava232___at_orgmortbayjettyservletServletHandler$CachedChaindoFilterServletHandlerjava1089___at_orgmortbayjettyservletServletHandlerhandleServletHandlerjava365___at_orgmortbayjettysecuritySecurityHandlerhandleSecurityHandlerjava216___at_orgmortbayjettyservletSessionHandlerhandleSessionHandlerjava181___at_orgmortbayjettyhan
Java_heap_space__javalangOutOfMemoryError_Java_heap_space___at_orgapacheluceneutilUnicodeUtilUTF16toUTF8UnicodeUtiljava236___at_orgapachelucenestoreIndexOutputwriteStringIndexOutputjava102___at_orgapacheluceneindexFieldsWriterwriteFieldFieldsWriterjava232___at_orgapacheluceneindexStoredFieldsWriterPerFieldprocessFieldsStoredFieldsWriterPerFieldjava56___at_orgapacheluceneindexDocFieldConsumersPerFieldprocessFieldsDocFieldConsumersPerFieldjava37___at_orgapacheluceneindexDocFieldProcessorPerThreadprocessDocumentDocFieldProcessorPerThreadjava234___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava762___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava745___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2215___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2187___at_orgapachesolrupdateDirectUpdateHandler2addDocDirectUpdateHandler2java238___at_orgapachesolrupdateprocessorRunUpdateProcessorprocessAddRunUpdateProcessorFactoryjava60___at_orgapachesolrhandlerXMLLoaderprocessUpdateXMLLoaderjava140___at_orgapachesolrhandlerXMLLoaderloadXMLLoaderjava69___at_orgapachesolrhandlerContentStreamHandlerBasehandleRequestBodyContentStreamHandlerBasejava54___at_orgapachesolrhandlerRequestHandlerBasehandleRequestRequestHandlerBasejava131___at_orgapachesolrcoreSolrCoreexecuteSolrCorejava1333___at_orgapachesolrservletSolrDispatchFilterexecuteSolrDispatchFilterjava303___at_orgapachesolrservletSolrDispatchFilterdoFilterSolrDispatchFilterjava232___at_orgmortbayjettyservletServletHandler$CachedChaindoFilterServletHandlerjava1089___at_orgmortbayjettyservletServletHandlerhandleServletHandlerjava365___at_orgmortbayjettysecuritySecurityHandlerhandleSecurityHandlerjava216___at_orgmortbayjettyservletSessionHandlerhandleSessionHandlerjava181___at_orgmortbayjettyhan request: http://localhost:8983/solr/mysolrcollection/update?commit=true&waitFlush=false&waitSearcher=false&wt=javabin
当我在ColdFusion Administrator中查看Solr Collection时,它比原始的Verity Collection大得多 - 现有的Verity Collection大约84-85MB,包含9000+个文档,而这个是1.3GB,只有847个文档。
此搜索功能对于应用程序至关重要,我担心如果迁移到Solr不起作用,我们将不得不推迟升级到CF10。
答案 0 :(得分:0)
这听起来像是一次性导入过程。您是否尝试过将结果批量编译为每次迭代500个文档。根据我的经验,当页面超过1分钟时,Coldfusion表现不佳。
答案 1 :(得分:0)
确保已安装ColdFusion Hotfix 2 for ColdFusion 9.0.1。
Cumulative Hot Fix 2 | ColdFusion 9.0.1
Hotfix包含Solr的一些主要错误修正,特别是在索引.PDF文件时。或者安装ColdFusion 9.0.2,但不再支持Verity。所以你将无法在Verity和Solr之间切换。