情况如下:
由于Solr要求每个文档都有一个唯一的ID,我创建了一个字段 solrId ,它从solr.UUIDUpdateProcessorFactory获取其ID。
但是,dataimport只提取了一些项目而没有任何单位。有人能指出我正确的方向吗?
相关段落:
solrconfig.xml中:
<updateRequestProcessorChain name="uuid">
<processor class="solr.UUIDUpdateProcessorFactory">
<str name="fieldName">solrId</str>
</processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
....
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">wiensued-data-config.xml</str>
<str name="update.chain">uuid</str>
</lst>
</requestHandler>
管理型模式:
<uniqueKey>solrId</uniqueKey>
<fieldType name="uuid" class="solr.UUIDField" indexed="true" />
<!-- solrId is the real ID -->
<field name="solrId" type="uuid" multiValued="false" indexed="true" stored="true" />
<!-- the ID from the database -->
<field name="id" type="int" multiValued="false" indexed="true" stored="true"/>
dataimporthandler配置为将id(从表中)索引到projectId或unitId
堆栈跟踪是:
org.apache.solr.common.SolrException: [doc=null] missing required field: solrId
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:265)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:107)
at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:212)
at org.apache.solr.update.AddUpdateCommand$1.next(AddUpdateCommand.java:185)
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:259)
at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:433)
at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1384)
at org.apache.solr.update.DirectUpdateHandler2.updateDocument(DirectUpdateHandler2.java:920)
at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:913)
at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:302)
at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:239)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:979)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1192)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:91)
at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:80)
at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:254)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:526)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:415)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:474)
at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:457)
at java.lang.Thread.run(Thread.java:748)
然而,就我所知,我们提供了solrId
答案 0 :(得分:1)
只需在你的dih配置中修复它,它就会变得更干净,更容易。
只需在项目ID前加一个'p'即可创建id,并将其提供给solr。与单位相同(前置'u')。你明白了这个想法:
<entity name="project" pk="id" query="select concat('p', id) as solrid, ...
当然sql取决于你的数据库。