I am using Solr's DataImportHandler to index Postgres data into Solr. My document structure has a parent entity and a corresponding child entity. Here is the data-config.xml file:
<entity name="assessment" pk="assm_pk"
query="select assm_pk,'building' as table_name,assm_no,own_name,oaadhar_no,floor_ar,prp_usg,bld_id,ulb_id from assessment_table"
deltaImportQuery="SELECT assm_pk,'building' as table_name,assm_no,own_name,oaadhar_no,floor_ar,ulb_id,prp_usg,bld_id from assessment_table WHERE assm_pk='${dih.delta.assm_pk}'"
deltaQuery="select assm_pk from assessment_table where update_ts > '${dih.last_index_time}'"
>
<field column="table_name" name="table_name"/>
<field column="assm_no" name="assm_no"/>
<field column="own_name" name="own_name"/>
<field column="oaadhar_no" name="aadhar_no"/>
<field column="floor_ar" name="floor_area"/>
<field column="prp_usg" name="prp_usg"/>
<field column="ulb_id" name="ulb_id"/>
<field column="assm_pk" name="assm_pk"/>
<entity name="building" pk="id" query="select id,bld_id,latitude,longitude,road_name from v2_buildings where CAST(bld_id as text) = '${assessment.bld_id}'"
deltaQuery="select id,bld_id from v2_buildings limit 100"
parentDeltaQuery="select assm_pk from assessment_table p where cast(p.bld_id as text) = '${dih.delta.bld_id}'">
<field column="bld_id" name="bld_id"/>
<field column="latitude" name="latitude"/>
<field column="longitude" name="longitude"/>
<field column="road_name" name="road_name"/>
</entity>
</entity>
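When I run the delta-import command, at least 100 documents (those associated with the child entity building) should be updated, but Solr does not update any documents. Here is the final status: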
{
"responseHeader": {
"status": 0,
"QTime": 0
},
"initArgs": [
"defaults",
[
"config",
"db-data-config.xml"
]
],
"command": "status",
"status": "idle",
"importResponse": "",
"statusMessages": {
"Total Requests made to DataSource": "103",
"Total Rows Fetched": "100",
"Total Documents Processed": "0",
"Total Documents Skipped": "0",
"Delta Dump started": "2018-02-20 16:48:43",
"Identifying Delta": "2018-02-20 16:48:43",
"Deltas Obtained": "2018-02-20 16:48:43",
"Building documents": "2018-02-20 16:48:43",
"Total Changed Documents": "0",
"": "Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.",
"Committed": "2018-02-20 16:48:43",
"Time taken": "0:0:0.290"
}
}
Here are the logs associated with the delta-import command:
2018-02-20 16:48:43.692 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.JdbcDataSource Time taken for getConnection(): 2
2018-02-20 16:48:43.694 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.DocBuilder Completed ModifiedRowKey for Entity: building rows obtained : 100
2018-02-20 16:48:43.694 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.DocBuilder Completed DeletedRowKey for Entity: building rows obtained : 0
2018-02-20 16:48:43.694 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.SqlEntityProcessor Running parentDeltaQuery for Entity: building
2018-02-20 16:48:43.698 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.SqlEntityProcessor Running parentDeltaQuery for Entity: building
20 ...
...
...
2018-02-20 16:48:43.959 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.DocBuilder Completed parentDeltaQuery for Entity: building
2018-02-20 16:48:43.959 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.DocBuilder Running ModifiedRowKey() for Entity: assessment
2018-02-20 16:48:43.959 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.JdbcDataSource Creating a connection for entity assessment with URL: jdbc:postgresql://localhost:5432/testdata
2018-02-20 16:48:43.962 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.JdbcDataSource Time taken for getConnection(): 2
2018-02-20 16:48:43.967 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.DocBuilder Completed ModifiedRowKey for Entity: assessment rows obtained : 0
2018-02-20 16:48:43.967 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.DocBuilder Completed DeletedRowKey for Entity: assessment rows obtained : 0
2018-02-20 16:48:43.967 INFO (Thread-50) [ x:psqlTest] o.a.s.h.d.DocBuilder Completed parentDeltaQuery for Entity: assessment
Why are no documents being updated, even though the log shows the parentDeltaQuery for building running?
Answer 0 (score: 0)
I found the answer while tinkering with db-config.xml. The problem was in the parentDeltaQuery: parentDeltaQuery="select assm_pk from assessment_table p where cast(p.bld_id as text) = '${dih.delta.bld_id}'">
Instead of dih.delta.bld_id I should have written building.bld_id, where building is the name of the child entity.
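For reference, here is a sketch of the child entity with only that variable changed. Everything else is copied from the question, including the deltaQuery with limit 100, which looks like a test query rather than a real change filter. As far as I understand it, each row returned by the child's deltaQuery is exposed to parentDeltaQuery under the child entity's name, whereas the dih.delta namespace is what deltaImportQuery uses, so '${dih.delta.bld_id}' most likely resolved to an empty value here and the parent query matched nothing.

```xml
<!-- Child entity from the question; only parentDeltaQuery is changed. -->
<entity name="building" pk="id"
        query="select id,bld_id,latitude,longitude,road_name from v2_buildings where CAST(bld_id as text) = '${assessment.bld_id}'"
        deltaQuery="select id,bld_id from v2_buildings limit 100"
        parentDeltaQuery="select assm_pk from assessment_table p where cast(p.bld_id as text) = '${building.bld_id}'">
  <!-- ${building.bld_id} refers to the bld_id column of each row returned by this entity's deltaQuery -->
  <field column="bld_id" name="bld_id"/>
  <field column="latitude" name="latitude"/>
  <field column="longitude" name="longitude"/>
  <field column="road_name" name="road_name"/>
</entity>
```

With this change, the assm_pk values returned by parentDeltaQuery should be picked up as changed parent rows and re-imported through the parent's deltaImportQuery.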