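Using Solr 7.7.2, we have a PostgreSQL database that stores binary data with the large object feature (https://jdbc.postgresql.org/documentation/head/binary-data.html), and we cannot work out how to configure Solr to index that field. We are using the following db-data-config.xml: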
<dataConfig>
  <dataSource
      type="JdbcDataSource"
      name="ISO17025V3"
      driver="org.postgresql.Driver"
      url="jdbc:postgresql://IS-Config-DB:5432/ISO17025V3"
      batchSize="0"
      user="postgres"
      password="xxxxx"
  />
  <dataSource name="fieldReader"
      type="FieldStreamDataSource"
  />
  <uniqueKey>file_name</uniqueKey>
  <document>
    <entity
        name="root"
        query="select d.file_name, dbf.file_contents, d.file_label, d.version, d.dir_num
               from db_document d
               inner join db_files dbf on (d.file_name = dbf.original_file_name and d.version = dbf.document_version and d.revision_no = dbf.revision_no )
               where dbf.file_contents is not null and dbf.parent_file_name is null
               and d.version = (select max(version) from db_document d2 where d.file_name = d2.file_name)
               order by d.file_label"
        onError="skip"
        dataSource="ISO17025V3">
      <field column="file_name" name="file_name" />
      <entity
          name="blob2"
          dataSource="fieldReader"
          processor="TikaEntityProcessor"
          dataField="root.file_contents" format="text" onError="skip" extractEmbedded="true">
      </entity>
    </entity>
  </document>
</dataConfig>
This results in the following Solr exception:
Exception in entity : blob2:java.lang.RuntimeException: unsupported type : class java.lang.Long
    at org.apache.solr.handler.dataimport.FieldStreamDataSource.getData(FieldStreamDataSource.java:77)
    at org.apache.solr.handler.dataimport.FieldStreamDataSource.getData(FieldStreamDataSource.java:47)
    at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:132)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
    at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
    at java.lang.Thread.run(Thread.java:748)
Can Solr work with PostgreSQL's large object feature, or does the binary data have to be stored as BYTEA?
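
For context, the java.lang.Long in the exception is consistent with file_contents being an oid column: the JDBC driver hands back the large object's numeric identifier rather than its bytes, and FieldStreamDataSource rejects that type. One direction that might avoid changing the storage, offered as an untested sketch rather than a confirmed fix, is to dereference the identifier inside the query with PostgreSQL's server-side lo_get() function (available from PostgreSQL 9.4), which returns the object's contents as bytea. The fragment below assumes that file_contents really is of type oid and that the driver then delivers the aliased column as a byte array; only the select list of the root entity differs from the configuration above.

```xml
<!-- Sketch only: assumes dbf.file_contents is an oid column and PostgreSQL >= 9.4 -->
<entity
    name="root"
    query="select d.file_name, lo_get(dbf.file_contents) as file_contents, d.file_label, d.version, d.dir_num
           from db_document d
           inner join db_files dbf on (d.file_name = dbf.original_file_name and d.version = dbf.document_version and d.revision_no = dbf.revision_no )
           where dbf.file_contents is not null and dbf.parent_file_name is null
           and d.version = (select max(version) from db_document d2 where d.file_name = d2.file_name)
           order by d.file_label"
    onError="skip"
    dataSource="ISO17025V3">
  <field column="file_name" name="file_name" />
  <!-- The nested entity is unchanged; it still reads root.file_contents -->
  <entity
      name="blob2"
      dataSource="fieldReader"
      processor="TikaEntityProcessor"
      dataField="root.file_contents" format="text" onError="skip" extractEmbedded="true">
  </entity>
</entity>
```

Note that lo_get() materializes each large object in full as a single bytea value, so very large files may put pressure on memory on both the database and the Solr side.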