How do I correctly create an indexed blob field with Solr 5?

Time: 2015-04-08 20:12:42

Tags: java oracle solr blob

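I think the title of my question explains most of what I need. I am using the Data Import Handler of Apache SOLR 5. I have configured solrconfig.xml, schema.xml, and data-config.xml, and the import is currently working.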

However, I need to add one more field: an Oracle BLOB column. First, let me show my configuration:

In data-config.xml

<dataConfig>

    <!-- Datasource -->
    <dataSource name="myDS" 
                setReadOnly="true" 
                driver="oracle.jdbc.OracleDriver" 
                url="jdbc:oracle:thin:@//server.example.com:1521/service_name" 
                user="user" 
                password="pass"/>

    <document name="products">
        <entity name="product"
                dataSource="myDS" 
                query="select * from products"
                pk="id"
                processor="SqlEntityProcessor">
            <field column="id" name="id" />
            <field column="name" name="name" />
            <field column="price" name="price" />
            <field column="store" name="store" />
            <!-- I've added this blob field -->
            <field column="picture" name="picture" />
        </entity>
    </document>
</dataConfig>

In solrconfig.xml

  <requestHandler name="/products" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
          <str name="config">data-config.xml</str>
      </lst>
  </requestHandler>

  <!-- JDBCs -->
  <lib dir="../../../lib" />

My fields in schema.xml

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="_text" type="string" indexed="true" stored="false" multiValued="true"/>
<field name="name" type="string" indexed="true" stored="true"/>
<field name="price" type="float" indexed="true" stored="true"/>

<!-- BLOB field -->
<field name="picture" type="binary" indexed="true" stored="true"/>

<copyField source="*" dest="_text"/>
<!-- omitted solr default fields -->

Now, when I run a full import, SOLR indexes only some of the records. This is the output after SOLR finishes the import:

Indexing completed. Added/Updated: 64 documents. Deleted 0 documents. (Duration: 04s)
Requests: 1 (0/s), Fetched: 1369 (342/s), Skipped: 0, Processed: 64 (16/s)
Started: less than a minute ago

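As you can see, 1369 records were fetched, but SOLR indexed only 64 documents. If I remove the picture field from the schema, or set both its indexed and stored attributes to false, SOLR imports all documents.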

I opened the SOLR log and found this error when the blob field is imported:

3436212 [Thread-19] WARN  org.apache.solr.handler.dataimport.SolrWriter  – Error creating document : SolrInputDocument(fields: [name=PRODUCTNAME, price=PRICE, store=STORE, picture=oracle.sql.BLOB@4130607a, _version_=1497915495941144576])
org.apache.solr.common.SolrException: ERROR: [doc=<ID>] Error adding field 'picture'='oracle.sql.BLOB@4130607a' msg=Illegal character .
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:176)
    at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:78)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:166)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:931)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1085)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:697)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
    at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
    at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:263)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:511)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: java.lang.IllegalArgumentException: Illegal character .
    at org.apache.solr.common.util.Base64.base64toInt(Base64.java:150)
    at org.apache.solr.common.util.Base64.base64ToByteArray(Base64.java:117)
    at org.apache.solr.schema.BinaryField.createField(BinaryField.java:89)
    at org.apache.solr.schema.FieldType.createFields(FieldType.java:305)
    at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:48)
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:123)
    ... 18 more
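
A note on what the trace appears to show: BinaryField.createField hands the field value to Base64.base64ToByteArray, and the value in the log is the text oracle.sql.BLOB@4130607a, which looks like the driver object's default toString() output rather than the picture bytes. The sketch below reproduces that kind of failure outside Solr. It is only an illustration under assumptions: it uses the JDK's java.util.Base64 as a stand-in for Solr's own org.apache.solr.common.util.Base64 (a different class), the class and method names are hypothetical, and the picture bytes are made up.

```java
import java.util.Base64;

public class BinaryFieldValueDemo {

    // Returns true when the text can be decoded as Base64, false otherwise.
    // Assumption: the JDK decoder is used here in place of Solr's Base64 class.
    public static boolean isBase64(String text) {
        try {
            Base64.getDecoder().decode(text);
            return true;
        } catch (IllegalArgumentException e) {
            // Characters such as '.' and '@' are not part of the Base64 alphabet.
            return false;
        }
    }

    public static void main(String[] args) {
        // The value Solr received, copied from the log line above.
        String fromLog = "oracle.sql.BLOB@4130607a";

        // Made-up picture bytes (the start of a PNG header), encoded as Base64 text.
        byte[] pictureBytes = {(byte) 0x89, 'P', 'N', 'G'};
        String encoded = Base64.getEncoder().encodeToString(pictureBytes);

        System.out.println(fromLog + " decodes as Base64: " + isBase64(fromLog));
        System.out.println(encoded + " decodes as Base64: " + isBase64(encoded));
    }
}
```

If this reading is right, the binary field is receiving the text form of the Blob object instead of its content, which would explain why every row with a picture is rejected.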

I checked by querying the database directly, and it works fine. I am using SOLR 5, ojdbc7, and Java 8. How do I use a binary field correctly in SOLR?


Update

I changed the picture field in schema.xml to indexed=false, like this:

<!-- BLOB field -->
<field name="picture" type="binary" indexed="false" stored="true"/>

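Then I restarted SOLR, reloaded my core, and ran a full import again. It did not succeed, and the same exception occurred. Only the 64 documents mentioned above were imported, and the picture field does not appear in the JSON response. The query I ran was: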

/select?q=*%3A*&wt=json&indent=true
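
Because the same exception appears with indexed=false, the value reaching Solr is presumably still the driver's Blob object. One possible direction is to turn it into a byte[] before it reaches the binary field. The following is a minimal sketch of that conversion using plain JDBC only. BlobBytes and toBytes are hypothetical names, SerialBlob stands in for the Oracle blob, and wiring this into the Data Import Handler (for example through a custom transformer) is not shown and has not been tested against this setup.

```java
import java.sql.Blob;
import java.sql.SQLException;
import java.util.Arrays;
import javax.sql.rowset.serial.SerialBlob;

public class BlobBytes {

    // Copies the full content of a JDBC Blob into a byte array.
    // Assumption: the blob fits in memory (pictures of moderate size).
    public static byte[] toBytes(Blob blob) throws SQLException {
        if (blob == null) {
            return null; // a NULL column stays null
        }
        long length = blob.length();
        if (length == 0) {
            return new byte[0];
        }
        if (length > Integer.MAX_VALUE) {
            throw new SQLException("Blob too large for a single byte array: " + length);
        }
        // Blob positions are 1-based.
        return blob.getBytes(1, (int) length);
    }

    public static void main(String[] args) throws SQLException {
        // SerialBlob is a JDK in-memory Blob, used here instead of a database row.
        Blob blob = new SerialBlob(new byte[] {1, 2, 3, 4});
        System.out.println(Arrays.toString(toBytes(blob)));
    }
}
```

Separately, the Data Import Handler's JdbcDataSource documents a convertType attribute on the dataSource element. Whether enabling it makes the picture column arrive as bytes in this Oracle configuration is an assumption that would need to be verified.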

0 answers