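I have a SolrCloud setup with 3 ZooKeeper nodes and 2 Solr instances. I am trying to index data from an XML file (nested documents) into Solr via DIH, and I want to strip trailing whitespace from the values so that it does not show up in search results.
Sample file: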
<product>
<doc>
<sku>...</sku>
<data>
<date>..</date>
<store>..</store>
<econn>..</econn>
</data>
</doc>
...
...
</product>
I have not shared the DIH configuration, as it is working fine.
I have tried the approaches from both of these links:
https://stackoverflow.com/questions/24570545/is-it-possible-to-get-solrs-dataimporthadler-to-ignore-fields-with-empty-string
https://fossies.org/linux/solr/solr/example/example-DIH/solr/atom/conf/solrconfig.xml
Actual file:
<doc>
<sku>abc </sku>
<data>
<date>2019-19-08</date>
<store>somestore </store>
<econn>false </econn>
</data>
</doc>
Expected output after indexing:
<doc>
<sku>abc</sku>
<data>
<date>2019-19-08</date>
<store>somestore</store>
<econn>false</econn>
</data>
</doc>
Trailing spaces should be trimmed in both the parent and the child documents, or in either one, depending on context.
Answer 0: (score: 0)
The solution that worked best for me was to apply RegexTransformer in the data-config.xml file.
<entity name="foo" transformer="RegexTransformer">
<field column="new_field" xpath="path/to/field/in/xml" regex="(\s|\t)" replaceWith="" />
...
...
...
...
</entity>
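For reference, here is a fuller sketch of how this might look in data-config.xml for the sample file in the question. One caveat: the pattern (\s|\t) matches every whitespace character, so it also removes spaces inside a value. The sketch uses \s+$ instead, which should strip trailing whitespace only. The file path, the /product/doc path, and the entity name are assumptions based on the sample file, not taken from the real DIH config, and the nested data element is flattened into the same document to keep things short.

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" />
  <document>
    <!-- Hypothetical entity: url and forEach are assumptions based on the sample file -->
    <entity name="doc"
            processor="XPathEntityProcessor"
            url="/path/to/products.xml"
            forEach="/product/doc"
            transformer="RegexTransformer">
      <!-- \s+$ matches only whitespace at the end of the value; replaceWith="" deletes it -->
      <field column="sku"   xpath="/product/doc/sku"        regex="\s+$" replaceWith="" />
      <field column="store" xpath="/product/doc/data/store" regex="\s+$" replaceWith="" />
      <field column="econn" xpath="/product/doc/data/econn" regex="\s+$" replaceWith="" />
      <!-- date has no trailing space in the sample, so it is mapped without a regex -->
      <field column="date"  xpath="/product/doc/data/date" />
    </entity>
  </document>
</dataConfig>
```

Since no sourceColName is given, RegexTransformer applies the pattern to the field's own column, so with this config "somestore " should be indexed as "somestore" while a value with interior spaces keeps them.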
Sometimes the answer is simple!