How can I find out why documents failed during a data import?

Time: 2019-03-21 15:38:44

Tags: solr

I am trying to reindex data in Solr 7.7.1. The response after the import is:

 "Total Documents Failed": "595",

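That is probably the number of documents I tried to add, so I believe something is wrong with my database import query. However, I cannot find any error log entries. Could there be a problem with the unique identifier?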

The SQL query appears to be valid, since I can run it directly without errors.

<entity name="watches" 
        transformer="LogTransformer"
        logTemplate="watches attributes import for classified ID ${classifieds.ID}" logLevel="info"
        query="SELECT
                    a.URL AS url_article,
                    a.article_id,
                    catr_ref.attr_de AS refnumber,
                    cw.ref_id,
                    cw.year,   
                    cw.box,
                    cw.papers,

                    cw.dial_c_id AS dial_id,                                  
                    CONCAT (cw.dial_c_id, '#', catr_dial.attr_de) AS dial,

                    cw.power_reserve,   
                    /* CONCAT (ca_power.value, '#', catr_power.attr_de) AS power, */

                    cw.dial_n_id,                                    
                    CONCAT (cw.dial_n_id, '#', catr_dialn.attr_de) AS dialn,

                    cw.bracelet_type_id,                                    
                    CONCAT (cw.bracelet_type_id, '#', catr_bra_t.attr_de) AS brace_t,

                    cw.case_m_id AS case_id,                                  
                    CONCAT (cw.case_m_id, '#', catr_case.attr_de) AS caseing,

                    cw.water_id, 
                    CONCAT (cw.water_id, '#', catr_water.attr_de) AS water,

                    cw.winding_id,                                  
                    CONCAT (cw.winding_id, '#', catr_wind.attr_de) AS winding,

                    cw.availability AS avail_id,
                    CONCAT (cw.availability, '#', catr_avail.attr_de) AS avail,

                    cw.cond AS cond_id,
                    CONCAT (cw.cond, '#', catr_cond.attr_de) AS cond,

                    /* mark unique watch article to collapse later. e.g. ref#color#bracelet */
                    CONCAT (cw.ref_id, '#', cw.dial_c_id) AS unique_watch
               FROM
                    classifieds_watches AS cw 
               LEFT JOIN cat_attr AS catr_ref   ON catr_ref.attr_id     =  cw.ref_id
               LEFT JOIN cat_attr AS catr_dial  ON catr_dial.attr_id    =  cw.dial_c_id
               LEFT JOIN cat_attr AS catr_dialn ON catr_dialn.attr_id   =  cw.dial_n_id
               LEFT JOIN cat_attr AS catr_bra_t ON catr_bra_t.attr_id   =  cw.bracelet_type_id
               LEFT JOIN cat_attr AS catr_case  ON catr_case.attr_id    =  cw.case_m_id
               LEFT JOIN cat_attr AS catr_water ON catr_water.attr_id   =  cw.water_id
               LEFT JOIN cat_attr AS catr_wind  ON catr_wind.attr_id    =  cw.winding_id
               LEFT JOIN cat_attr AS catr_avail ON catr_avail.attr_id   =  cw.availability
               LEFT JOIN cat_attr AS catr_cond ON catr_cond.attr_id     =  cw.cond                           
               LEFT JOIN articles AS a ON cw.article_id = a.article_id
               WHERE 
                    cw.cl_id  = ${classifieds.id}">
</entity>

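After experimenting for a while, I found that the field "year" is responsible: when I set its value to NULL in the dataset, the document is imported into Solr.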

The field type:

     <field name="year" type="tint" indexed="true" stored="true" required="false" />
    <fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>

In MySQL, the column is of type year(4).

As far as I can tell, this looks fine. Why is it a problem?
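
One possible explanation, which the question itself does not confirm: MySQL Connector/J by default returns YEAR columns as a date value rather than a plain number (its yearIsDateType connection property defaults to true), and a date-like value such as 2019-01-01 cannot be parsed by an integer field like tint. A minimal sketch of a workaround, assuming that is the cause, is to cast the column to a number in the entity query. The fragment below is a hypothetical, trimmed variant of the "watches" entity that shows only the year column; the remaining columns and joins would stay as they are.

```xml
<!-- Hypothetical, trimmed variant of the "watches" entity (assumption: the YEAR
     column reaches Solr as a date-like value). CAST makes MySQL return a plain
     number, which an integer field such as "tint" can parse. -->
<entity name="watches"
        query="SELECT
                    CAST(cw.year AS UNSIGNED) AS year
               FROM
                    classifieds_watches AS cw
               WHERE
                    cw.cl_id = ${classifieds.id}">
</entity>
```

If the cast makes the failures disappear, appending yearIsDateType=false to the JDBC URL of the data source may be an alternative worth testing. Running the import with the DataImportHandler request parameters debug=true and verbose=true may also help surface the per-document error that the summary response hides.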

0 answers:

No answers yet.