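I am trying to import data into a Solr core from files such as CSV, XML, and JSON. I am new to Solr, so this may be simple for some people, but I have tried many suggestions found online without getting the expected result.
I have a JSON file, and I enabled data import by adding the following requestHandler to solrconfig.xml: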
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">solr-data-config.xml</str>
</lst>
</requestHandler>
In solr-data-config.xml:
<dataConfig>
<dataSource name="dfs" type="FileDataSource"/>
<document>
<entity name="sourcefile" processor="FileListEntityProcessor" fileName=".*" rootEntity="false" baseDir="${solr.install.dir}/example/exampledocs">
<entity name="entryline" processor="LineEntityProcessor" url="${sourcefile.fileAbsolutePath}" rootEntity="true" dataSource="fds" separator=","/>
</entity>
</document>
</dataConfig>
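For comparison, here is a minimal sketch of how a CSV file might be read line by line with this handler. It assumes a hypothetical two-column layout (id,name) and that both fields exist in the schema. Two details differ from the config above: the dataSource name ("dfs") does not match the name the inner entity refers to ("fds"), and, as far as I know, LineEntityProcessor has no separator attribute. It returns each line in a single column called rawLine, which then has to be split by a transformer such as RegexTransformer.

```xml
<dataConfig>
  <!-- The name must match the dataSource attribute of the entity that reads file content. -->
  <dataSource name="fds" type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- Outer entity only lists files, so it needs no data source of its own. -->
    <entity name="sourcefile"
            processor="FileListEntityProcessor"
            baseDir="${solr.install.dir}/example/exampledocs"
            fileName=".*\.csv$"
            recursive="false"
            rootEntity="false"
            dataSource="null">
      <!-- One row per line; the whole line arrives in the rawLine column. -->
      <entity name="entryline"
              processor="LineEntityProcessor"
              url="${sourcefile.fileAbsolutePath}"
              dataSource="fds"
              rootEntity="true"
              transformer="RegexTransformer">
        <!-- Assumed layout: id,name. Each regex captures one group from rawLine. -->
        <field column="id" sourceColName="rawLine" regex="^([^,]*),.*$"/>
        <field column="name" sourceColName="rawLine" regex="^[^,]*,(.*)$"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

If the file has a header row, the skipLineRegex attribute of LineEntityProcessor can be used to drop it. The regexes above are only illustrative and do not handle quoted commas.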
Updated version:
<dataConfig>
<script><![CDATA[
function CategoryPieces(row) {
var pieces = row.get('manu_id_s').split('/');
var arr = new Array();
for (var i=0; i < pieces.length; i++) {
row.put('manu_id_s' + i, pieces[i].trim());
arr[i] = pieces[i].trim();
}
row.put('manu_id_s', (pieces.length - 1).toFixed());
row.put('manu_id_s', arr.join('/'));
return row;
}
]]></script>
<dataSource type="FileDataSource" />
<document>
<entity
name="document"
processor="FileListEntityProcessor"
baseDir="${solr.install.dir}/example/exampledocs/khaled"
fileName=".*.xml$"
recursive="false"
rootEntity="false"
dataSource="null">
<entity
name="test"
processor="XPathEntityProcessor"
transformer="script:CategoryPieces"
url="${document.fileAbsolutePath}"
useSolrAddSchema="true"
stream="true">
</entity>
</entity>
</document>
</dataConfig>
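I placed a JSON, a CSV, and an XML file in that path. When I run the data import from the Solr admin UI, the status shows something like Requests: 0, Fetched: 26, Skipped: 0, Processed: 0, and nothing appears in the logs. Can anyone suggest how to add the data from these files to Solr?
This is the sample data in my XML file: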
<add>
<doc>
<field name="id">USD</field>
<field name="name">One Dollar</field>
<field name="manu">Bank of America</field>
<field name="manu_id_s">boa</field>
<field name="cat">currency</field>
<field name="features">Coins and notes</field>
<field name="price_c">1,USD</field>
<field name="inStock">true</field>
</doc>
<doc>
<field name="id">EUR</field>
<field name="name">One Euro</field>
<field name="manu">European Union</field>
<field name="manu_id_s">eu</field>
<field name="cat">currency</field>
<field name="features">Coins and notes</field>
<field name="price_c">1,EUR</field>
<field name="inStock">true</field>
</doc>
<doc>
<field name="id">GBP</field>
<field name="name">One British Pound</field>
<field name="manu">U.K.</field>
<field name="manu_id_s">uk</field>
<field name="cat">currency</field>
<field name="features">Coins and notes</field>
<field name="price_c">1,GBP</field>
<field name="inStock">true</field>
</doc>
<doc>
<field name="id">NOK</field>
<field name="name">One Krone</field>
<field name="manu">Bank of Norway</field>
<field name="manu_id_s">nor</field>
<field name="cat">currency</field>
<field name="features">Coins and notes</field>
<field name="price_c">1,NOK</field>
<field name="inStock">true</field>
</doc>
</add>
The query now returns the following data:
{
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"q":"*:*",
"_":"1547556805546"}},
"response":{"numFound":4,"start":0,"docs":[
{
"manu_id_s":"boa",
"id":"USD",
"_version_":1622731088078045184},
{
"manu_id_s":"eu",
"id":"EUR",
"_version_":1622731088081190912},
{
"manu_id_s":"uk",
"id":"GBP",
"_version_":1622731088081190913},
{
"manu_id_s":"nor",
"id":"NOK",
"_version_":1622731088082239488}]
}}
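Only id and manu_id_s come back in the response above. One possible explanation, offered as an assumption since the core's schema is not shown, is that the Data Import Handler only adds fields the schema knows about: manu_id_s would match a *_s dynamic field, while name, manu, cat, features, and inStock may not be defined. A sketch of schema entries that would cover them, with field types chosen for illustration only:

```xml
<!-- Hypothetical additions to the core's schema; the types are assumptions, not taken from the post. -->
<field name="name"     type="text_general" indexed="true" stored="true"/>
<field name="manu"     type="text_general" indexed="true" stored="true"/>
<field name="cat"      type="string"       indexed="true" stored="true" multiValued="true"/>
<field name="features" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="inStock"  type="boolean"      indexed="true" stored="true"/>
```

After changing the schema, the core would need to be reloaded and the import run again for the extra fields to appear.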