I have a Solr instance with the following DIH config file:
<dataConfig>
<dataSource type="JdbcDataSource" name="db" driver="org.postgresql.Driver" url="jdbc:postgresql://127.0.0.1:5432/db" user="user" password="password" />
<document name="db">
<entity name="sites" query="SELECT url, title, body, last_parse FROM sites where body is not null;"
deltaQuery="SELECT url, title, body, last_parse FROM sites where body is not null and last_parse > '${dih.last_index_time}'"
deltaImportQuery="SELECT url, title, body, last_parse FROM sites where body is not null and last_parse > '${dih.last_index_time}'">
<field column="url" name="url" />
<field column="title" name="title" />
<field column="body" name="body" />
<field column="last_parse" name="last_parse" />
</entity>
</document>
</dataConfig>
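It works, but it does not add new data from Postgres. The first time I ran this delta import, it detected about 300,000 new rows but updated only 11 documents in the index. When I ran it again, it detected nothing. I have 3.1 million rows in Postgres but only 2.7 million documents in the Solr index. I want Solr to add all new rows from Postgres and also to update documents whose last_parse is earlier than the current date.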
UPD: I rewrote the queries as follows, but the import is now very slow:
<entity name="sites" pk="id"
query="SELECT id, url, title, body, last_parse FROM sites where body is not null;"
deltaQuery="SELECT id FROM sites WHERE body is not null and last_parse > '${dataimporter.last_index_time}'"
deltaImportQuery="SELECT id, url, title, body, last_parse FROM sites where body is not null and id = '${dataimporter.delta.id}'"
>
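For comparison, here is a rough sketch of the single-query alternative that the Solr wiki describes as "delta import via full import". With deltaQuery/deltaImportQuery, DIH runs the deltaImportQuery once for every id returned by deltaQuery, which is probably why the version above is slow when hundreds of thousands of rows have changed. The sketch assumes the same sites table and columns as above and that id is a usable unique key; it has not been tested against this data set.

```xml
<!-- Sketch only: one query serves both full and incremental imports. -->
<!-- With clean=true (the default) the first condition is true and every row is read. -->
<!-- With clean=false only rows newer than the last import are read. -->
<entity name="sites" pk="id"
        query="SELECT id, url, title, body, last_parse FROM sites
               WHERE body IS NOT NULL
               AND ('${dataimporter.request.clean}' != 'false'
                    OR last_parse > '${dataimporter.last_index_time}')">
  <field column="id" name="id" />
  <field column="url" name="url" />
  <field column="title" name="title" />
  <field column="body" name="body" />
  <field column="last_parse" name="last_parse" />
</entity>
```

Assuming the handler is registered at /dataimport, an incremental run would then be requested with command=full-import&clean=false instead of command=delta-import.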