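Once again... I have 2 tables, 'blog' and 'comment'. A blog can contain n comments (blog --1:n-- comment). So far I have been using the following select to insert the data into the Solr index: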
<entity name="blog" dataSource="mssqlDatasource" pk="id"
transformer="ClobTransformer"
query="SELECT b.id, b.market, b.title AS blogTitle, b.message AS
blogMessage, c.message AS commentMessage, c.secondColumn
FROM blog b LEFT JOIN comment c ON b.id = c.source_id
AND c.source_type = 'blog'">
<field column="blogMessage" name="blogMessage" clob="true" />
<field column="commentMessage" name="commentMessage" clob="true" />
</entity>
The resulting index looks like this:
<doc>
<str name="id">1</str>
<str name="market">12</str>
<str name="title">blog of title 1</str>
<str name="blogMessage">message of blog 1</str>
<str name="commentMessage">message of comment</str>
<str name="secondColumn">Im the second column from comment</str>
</doc>
<doc>
<str name="id">1</str>
<str name="market">12</str>
<str name="title">blog of title 1</str>
<str name="blogMessage">message of blog 1</str>
<str name="commentMessage">message of comment - Im the second comment</str>
<str name="secondColumn">Im the second column from comment</str>
</doc>
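I would say this is silly, because I end up with too much indexed data for the same blog, where only the comment differs. Is it possible to set up 'comment' as a "sub-entity", like this: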
<entity name="blog" dataSource="mssqlDatasource" pk="id"
transformer="ClobTransformer"
query="SELECT b.id, b.market, b.title AS blogTitle, b.message AS
blogMessage
FROM blog b">
<field column="blogMessage" name="blogMessage" clob="true" />
<entity name="comment" dataSource="mssqlDatasource" pk="id"
transformer="ClobTransformer"
query="SELECT c.message as commentMessage, c.secondColumn
FROM comment c
WHERE c.source_id = ${blog.id}">
<field column="commentMessage" name="commentMessage" clob="true" />
</entity>
</entity>
Is this possible? What would the result look like (I can't test it until Monday)?
Answer 0 (score: 1)
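You are almost there. If you want to populate multivalued comment fields for each blog document, you'll want CachedSqlEntityProcessor, which runs the comment query once and caches the rows instead of querying the database again for every blog.
Mine looks a lot like this (I have left out the ClobTransformer bits, but you will obviously need them):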
<entity name="blog"
query="SELECT b.id,
b.market,
b.title AS blogTitle,
b.message AS
blogMessage
FROM blog b">
<entity name="blog_comment"
query="SELECT c.message as commentMessage,
c.secondColumn,
c.blog_id
FROM comment c"
processor="CachedSqlEntityProcessor"
where="blog_id=blog.id"/>
</entity>
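For completeness, here is an untested sketch of the same configuration with the parts from the question put back in: the `mssqlDatasource` data source and the `ClobTransformer` field declarations. It assumes, as above, that the comment table exposes a `blog_id` column; with the schema from the question the join column would be `source_id` instead.

```xml
<entity name="blog" dataSource="mssqlDatasource" pk="id"
        transformer="ClobTransformer"
        query="SELECT b.id, b.market, b.title AS blogTitle,
                      b.message AS blogMessage
               FROM blog b">
  <field column="blogMessage" name="blogMessage" clob="true" />
  <!-- The comment query runs once; cached rows are matched to each blog via "where" -->
  <entity name="blog_comment" dataSource="mssqlDatasource"
          transformer="ClobTransformer"
          processor="CachedSqlEntityProcessor"
          query="SELECT c.message AS commentMessage,
                        c.secondColumn,
                        c.blog_id
                 FROM comment c"
          where="blog_id=blog.id">
    <field column="commentMessage" name="commentMessage" clob="true" />
  </entity>
</entity>
```

The cache holds the whole comment result set in memory, so this trades database round trips for heap space during the import.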
The document should then look like this:
<doc>
<str name="id">1</str>
<str name="market">12</str>
<str name="title">blog of title 1</str>
<str name="blogMessage">message of blog 1</str>
<arr name="commentMessage">
<str>message of comment</str>
<str>message of comment - Im the second comment</str>
</arr>
<arr name="secondColumn">
<str>Im the second column from comment</str>
<str>Im the second column from comment</str>
</arr>
</doc>
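The `<arr>` elements only come out this way if the two comment fields are declared as multivalued in `schema.xml`. A minimal sketch, assuming a text field type named `text_general` (the type name and the indexed/stored settings are assumptions; use whatever your schema defines):

```xml
<!-- schema.xml: each blog document may carry many comment values -->
<field name="commentMessage" type="text_general" indexed="true" stored="true" multiValued="true" />
<field name="secondColumn"   type="text_general" indexed="true" stored="true" multiValued="true" />
```

Without `multiValued="true"`, Solr rejects a document that supplies more than one value for the field.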
If you want to avoid the duplicate values, you will probably need several nested entities, with one query per column.
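One way this could look, as an untested sketch: each comment column gets its own cached sub-entity, and the one that repeats uses `SELECT DISTINCT`. The entity names `blog_comment_message` and `blog_comment_second` are made up for illustration, and using `DISTINCT` for the de-duplication is my assumption about what "one query per column" would contain. It is applied only to `secondColumn`, since SQL Server does not allow `DISTINCT` on `text`/`ntext` columns.

```xml
<entity name="blog"
        query="SELECT b.id, b.market, b.title AS blogTitle,
                      b.message AS blogMessage
               FROM blog b">
  <!-- One value per comment -->
  <entity name="blog_comment_message"
          query="SELECT c.message AS commentMessage, c.blog_id
                 FROM comment c"
          processor="CachedSqlEntityProcessor"
          where="blog_id=blog.id"/>
  <!-- Repeated values collapse to one row per (secondColumn, blog_id) pair -->
  <entity name="blog_comment_second"
          query="SELECT DISTINCT c.secondColumn, c.blog_id
                 FROM comment c"
          processor="CachedSqlEntityProcessor"
          where="blog_id=blog.id"/>
</entity>
```

Note that once the columns are split like this, the values in `commentMessage` and `secondColumn` no longer line up position by position.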