I am still new to Solr. I am trying to index the nested structure shown below and am having trouble indexing it with SolrJ 6.1.
schema.xml
<?xml version="1.0" encoding="UTF-8"?>
<schema name="example" version="1.6">
<uniqueKey>id</uniqueKey>
<defaultSearchField>title</defaultSearchField>
...
<!-- all the fieldType definitions are described here -->
...
<field name="_root_" type="string" indexed="true" stored="false"/>
<field name="_version_" type="long" indexed="true" stored="false"/>
<field name="id" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="imdbId" type="string" indexed="true" stored="true"/>
<field name="rating" type="float" indexed="true" stored="true"/>
<field name="title" type="text_en" indexed="true" stored="true"/>
<field name="type" type="string" indexed="true" stored="true"/>
<field name="userId" type="string" indexed="true" stored="true"/>
</schema>
SolrJ Attempt
I do it in three steps.
SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/ml_core").build();
SolrInputDocument doc, childDoc;
String[] line;
CSVReader reader;
// Step 1: Create a document - works fine
reader = new CSVReader(new FileReader("movies.csv")); // file structure: movieId,title
while ((line = reader.readNext()) != null) {
    doc = new SolrInputDocument();
    doc.addField("id", line[0]);
    doc.addField("title", line[1]);
    doc.addField("type", "film");
    solr.add(doc);
}
// Step 2: Atomically update a document created in step 1 - works fine
reader = new CSVReader(new FileReader("links.csv")); // file structure: movieId,imdbId
while ((line = reader.readNext()) != null) {
    doc = new SolrInputDocument();
    doc.addField("id", line[0]);
    Map<String, Object> imdbIdModifier = new HashMap<>(1);
    imdbIdModifier.put("set", line[1]);
    doc.addField("imdbId", imdbIdModifier); // add the map as the field value
    solr.add(doc);
}
// Step 3: Add nested (child) documents to an existing document - this is where it goes wrong
reader = new CSVReader(new FileReader("ratings.csv")); // file structure: movieId,userId,rating
while ((line = reader.readNext()) != null) {
    doc = new SolrInputDocument();
    doc.addField("id", line[0]);
    childDoc = new SolrInputDocument();
    childDoc.addField("id", line[0] + "_" + line[1]);
    childDoc.addField("userId", line[1]);
    childDoc.addField("type", "user");
    childDoc.addField("rating", line[2]);
    doc.addChildDocument(childDoc);
    solr.add(doc);
}
solr.commit();
solr.optimize();
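One possible reading of Step 3 is that every ratings row sends a new parent carrying only an id and a single child, whereas Solr's block join generally expects a parent and all of its children to be indexed together as one block. Under that assumption, a first step could be to collect all ratings of a movie before building the parent. The following is only a sketch in plain Java: `RatingGrouper`, `Rating`, and `groupByMovie` are hypothetical names, and the rows are assumed to be header-free `movieId,userId,rating` arrays such as those returned by `reader.readNext()`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: groups ratings.csv rows (movieId,userId,rating) by movieId. */
final class RatingGrouper {

    /** One future child document; field names mirror the schema in the question. */
    static final class Rating {
        final String id;      // movieId + "_" + userId, as in Step 3
        final String userId;
        final float rating;

        Rating(String id, String userId, float rating) {
            this.id = id;
            this.userId = userId;
            this.rating = rating;
        }
    }

    /** Returns movieId -> ratings, keeping the order in which movies first appear. */
    static Map<String, List<Rating>> groupByMovie(List<String[]> rows) {
        Map<String, List<Rating>> byMovie = new LinkedHashMap<>();
        for (String[] row : rows) {
            String movieId = row[0];
            Rating r = new Rating(movieId + "_" + row[1], row[1], Float.parseFloat(row[2]));
            byMovie.computeIfAbsent(movieId, k -> new ArrayList<>()).add(r);
        }
        return byMovie;
    }
}
```

With such a map, each movie could be sent once: build a single `SolrInputDocument` with `id`, `title`, `type`, and `imdbId`, call `addChildDocument` for every `Rating` in its list, and call `solr.add(doc)` one time per movie. This assumes the title and imdbId are still available at that point (for example, kept in memory while reading movies.csv and links.csv); whether it fully resolves the duplicate `id` seen in the query result is not confirmed here.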
My query: http://localhost:8983/solr/ml_core/select?indent=on&q=id:1&wt=json
{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"id:1",
      "indent":"on",
      "wt":"json",
      "_":"1471440200579"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"1",
        "title":"Toy Story (1995)",
        "type":"film",
        "imdbId":"0114709",
        "_version_":1542910355358875648},
      {
        "id":"1",
        "_version_":1542910730357964800,
        "_root_":"1"}]
  }}
Response - incorrect. The "id" field is duplicated, even though schema.xml marks this field as unique.
My query: http://localhost:8983/solr/ml_core/select?fl=*,[child%20parentFilter=type:film]&indent=on&q={!parent%20which=%27type:film%27}&wt=json
{
"error":{
"msg":"Parent query yields document which is not matched by parents filter, docID=19957",
"trace":"java.lang.IllegalStateException: Parent query yields document which is not matched by parents filter, docID=19957\r\n\tat org.apache.lucene.search.join.ToChildBlockJoinQuery$ToChildBlockJoinScorer.validateParentDoc(ToChildBlockJoinQuery.java:305)\r\n\tat org.apache.lucene.search.join.ToChildBlockJoinQuery$ToChildBlockJoinScorer.access$300(ToChildBlockJoinQuery.java:158)\r\n\tat
...
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)\r\n\tat java.lang.Thread.run(Thread.java:745)\r\n",
"code":500}
}
Response - incorrect.
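The block-join query above is percent-encoded by hand (`%20`, `%27`), which is easy to get wrong. As a side note, the same URL could be assembled programmatically. This is a minimal sketch using only the JDK (the `Charset` overload of `URLEncoder.encode` needs Java 10 or later); `SelectUrlBuilder` and `build` are hypothetical names, and the sketch only addresses encoding, not the "Parent query yields document..." error itself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds a Solr select URL with form-encoded parameters. */
final class SelectUrlBuilder {

    /** Appends each parameter as key=value, encoding both sides as UTF-8. */
    static String build(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            separator = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Parameter values taken from the query in the question.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "{!parent which='type:film'}");
        params.put("fl", "*,[child parentFilter=type:film]");
        params.put("indent", "on");
        params.put("wt", "json");
        System.out.println(build("http://localhost:8983/solr/ml_core/select", params));
    }
}
```

`URLEncoder` produces form encoding, so spaces become `+` rather than `%20`; both forms are normally accepted in a query string.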
My query: http://localhost:8983/solr/ml_core/select?indent=on&q=id:1&wt=json
This is the correct response I need:
{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"id:1",
      "indent":"on",
      "wt":"json",
      "_":"1471440410850"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"1",
        "title":"Toy Story (1995)",
        "type":"film",
        "imdbId":"0114709",
        "_version_":1542910355358875648,
        "_root_":"1"}]
  }}
My query: http://localhost:8983/solr/ml_core/select?fl=*,[child%20parentFilter=type:film]&indent=on&q={!parent%20which=%27type:film%27}&wt=json
This is the correct response I need:
{
  "responseHeader":{
    "status":0,
    "QTime":7,
    "params":{
      "q":"{!parent which='type:film'}",
      "indent":"on",
      "fl":"*,[child parentFilter=type:film]",
      "wt":"json",
      "_":"1471440410850"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"1",
        "title":"Toy Story (1995)",
        "type":"film",
        "imdbId":"0114709",
        "_version_":1542910355358875648,
        "_root_":"1",
        "_childDocuments_":[
          {
            "id":"1_Violet",
            "userId":"Violet",
            "type":"user",
            "rating":5.0},
          {
            "id":"1_Mcka",
            "userId":"Mcka",
            "type":"user",
            "rating":4.0}]}]
  }}
What do I need to do to get the desired document structure? How can I solve this with SolrJ? Thanks.