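I have a very strange problem with Solr 4.6.0.
The uniqueKey field "id" contains what looks like a hash for each document instead of my string value. If I add a single custom document through the update request handler in the Solr admin UI, I get the ID value I specified, "book_45", which is correct.
But when I run a full import with the DIH (Data Import Handler), the id field of every document contains a hash-like value such as "[B@53bd370f" instead of my custom value. So the problem must be in the DIH.
My import script: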
<dataConfig>
<dataSource
type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://host/database"
user="user"
password="password" />
<document name="project">
<entity name="document" transformer="RegexTransformer"
query="SELECT CONCAT('book_', b.id) AS book_id, b.slug, b.title, b.isbn,
b.publisher, b.releaseYear AS release_year, b.language, b.pageCount AS page_count, b.description,
b.print, b.addedBy_id AS added_by_id, b.dt AS created,
GROUP_CONCAT(a.name SEPARATOR ';') AS authors
FROM Book b
LEFT JOIN author_book ab ON ab.book_id = b.id
LEFT JOIN Author a ON a.id = ab.author_id
GROUP BY b.id
">
<field column="book_id" name="id" />
<field column="slug" name="book_slug" />
<field column="title" name="book_title" />
<field column="isbn" name="book_isbn" />
<field column="publisher" name="book_publisher" />
<field column="release_year" name="book_release_year" />
<field column="language" name="book_language" />
<field column="page_count" name="book_page_count" />
<field column="description" name="book_description" />
<field column="print" name="book_print" />
<field column="added_by_id" name="book_added_by_id" />
<field column="created" name="book_created" />
<field column="authors" splitBy=";" name="authors" />
</entity>
</document>
</dataConfig>
The id field in schema.xml (identical to the id field in the default core collection1 that ships with Solr):
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<uniqueKey>id</uniqueKey>
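Does anyone know what I am missing?
Answer 0 (score: 1)
[B@53bd370f is not a hash; it is the result of calling toString() on a byte[]. Whatever MySQL returns for that column is being treated as a byte[] rather than a String.
Try casting the id to varchar or char, like this: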
SELECT cast(CONCAT('book_', b.id) as CHAR) AS book_id...
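To see where a value like [B@53bd370f comes from, here is a minimal Java sketch. It is not DIH code; the class and method names (ByteArrayIdDemo, naiveToString, decode) are made up for illustration, and UTF-8 is an assumed encoding. It only shows that calling toString() on a byte[] gives the type tag "[B" plus a hex identity hash, while decoding the same bytes gives the text you expected.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayIdDemo {
    // Turns the value into a string via Object.toString().
    // For a byte[] this yields "[B", then "@", then the identity hash in hex.
    static String naiveToString(Object columnValue) {
        return String.valueOf(columnValue);
    }

    // Decodes the bytes when the value is a byte[] (UTF-8 is an assumption here),
    // otherwise falls back to toString().
    static String decode(Object columnValue) {
        if (columnValue instanceof byte[]) {
            return new String((byte[]) columnValue, StandardCharsets.UTF_8);
        }
        return String.valueOf(columnValue);
    }

    public static void main(String[] args) {
        // Stand-in for a column that the JDBC driver hands back as binary data.
        Object fromDriver = "book_45".getBytes(StandardCharsets.UTF_8);
        System.out.println(naiveToString(fromDriver)); // something like [B@53bd370f
        System.out.println(decode(fromDriver));        // book_45
    }
}
```

Casting in SQL, as shown above, is the simpler fix because the driver then returns a String in the first place and nothing on the Solr side has to change.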