Searching for a string from a comma separated field in solr

Time: 2017-06-12 16:59:00

Tags: java spring-mvc solr solrj solr6

I have installed solr-6.5.1 for my Spring MVC Java web application, following these references: http://www.baeldung.com/apache-solrj

https://github.com/eugenp/tutorials/tree/master/apache-solrj/src/main/java/com/baeldung/solrjava

I have a POJO declared as shown below:

public class WebContentSearchHB 
{
    private int webContentDefinitionId; 
    private String  pageTitle;
    private String  pageKwd;
    private String  pageDesc;
    private int siteId;
    private int applicationId;
    private Date    pageCreatedTime;
    private Date    pageUpdatedDate ;
    private String webContentData;
    private String webContentType;
    private String category;



    public int getWebContentDefinitionId() 
    {
        return webContentDefinitionId;
    }

    @Field("webContentDefinitionId")
    public void setWebContentDefinitionId(int webContentDefinitionId) 
    {
        this.webContentDefinitionId = webContentDefinitionId;
    }
    public String getPageTitle() 
    {
        return pageTitle;
    }

    @Field("pageTitle")
    public void setPageTitle(String pageTitle) 
    {
        this.pageTitle = pageTitle;
    }
    public String getPageKwd() 
    {
        return pageKwd;
    }

    @Field("pageKwd")
    public void setPageKwd(String pageKwd) 
    {
        this.pageKwd = pageKwd;
    }
    public String getPageDesc() 
    {
        return pageDesc;
    }

    @Field("pageDesc")
    public void setPageDesc(String pageDesc) 
    {
        this.pageDesc = pageDesc;
    }

    public int getSiteId() 
    {
        return siteId;
    }

    @Field("siteId")
    public void setSiteId(int siteId) 
    {
        this.siteId = siteId;
    }

    public int getApplicationId() 
    {
        return applicationId;
    }

    @Field("applicationId")
    public void setApplicationId(int applicationId) 
    {
        this.applicationId = applicationId;
    }

    public Date getPageCreatedTime() 
    {
        return pageCreatedTime;
    }

    @Field("pageCreatedTime")
    public void setPageCreatedTime(Date pageCreatedTime) 
    {
        this.pageCreatedTime = pageCreatedTime;
    }

    public Date getPageUpdatedDate() 
    {
        return pageUpdatedDate;
    }

    @Field("pageUpdatedDate")
    public void setPageUpdatedDate(Date pageUpdatedDate) 
    {
        this.pageUpdatedDate = pageUpdatedDate;
    }

    public String getWebContentData() 
    {
        return webContentData;
    }

    @Field("webContentData")
    public void setWebContentData(String webContentData) 
    {
        this.webContentData = webContentData;
    }

    public String getWebContentType() 
    {
        return webContentType;
    }

    @Field("webContentType")
    public void setWebContentType(String webContentType) 
    {
        this.webContentType = webContentType;
    }

    public String getCategory() {
        return category;
    }

    @Field("category")
    public void setCategory(String category) {
        this.category = category;
    }

}

I haven't created a schema.xml file or edited the existing one. I set the value of each field in the POJO manually and add it to the Solr index from my application as follows:

solrClient = new HttpSolrClient.Builder(solrUrl).build();
solrClient.setParser(new XMLResponseParser());
WebContentSearchHB searcHB = new WebContentSearchHB();
// code that sets the field values on searcHB goes here
solrClient.addBean(searcHB);
solrClient.commit();

I have also added the following Maven dependency to my pom.xml file:

<dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-solrj</artifactId>
    <version>6.5.1</version>
</dependency>

One of the fields in the WebContentSearchHB class, category, holds a comma-separated string of the IDs of the categories that the content belongs to. A sample document looks like this:

[
{"pageTitle":["Test page"],
"pageKwd":["Test page"],
"pageDesc":["Test page"],
"applicationId":[1],
"siteId":[5],
"category":["2,6,7,8"],
"pageCreatedTime":["2017-02-17T05:58:19.648Z"],
"pageUpdatedDate":["2017-06-12T03:46:45.489Z"],
"webContentDefinitionId":[4947],
"webContentType":["simplewebcontent.html"],
"id":"717821d9-989e-4c4f-b66a-8b5185ed88ca",
"webContentData":["test"],
"_version_":1570012287149801472}
]

Here, multiple categories are stored as comma-separated values. When I search the category field as follows:

http://localhost:8983/solr/swcm_qa/select?indent=on&q=category:7*&wt=json

no data is returned. But if I search as follows:

http://localhost:8983/solr/swcm_qa/select?indent=on&q=category:2*&wt=json

all rows where 2 appears as the first value in the comma-separated string are returned. How can I search for a single value among the comma-separated values in the category field? Also, how can I indicate in the @Field annotation that the field stores multiple values as a comma-separated string?

1 Answer:

Answer 0 (score: 0)

In the category field, "2,6,7,8" is indexed as a single string:

category:["2,6,7,8"]

It should be:

category:["2","6","7","8"]

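You should apply a filter to the category field before indexing, using the comma as the separator, so that each individual value is stored in the field on its own.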
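One way to do the splitting on the client side, before the bean is sent to Solr, is sketched below. This is a minimal sketch, not the asker's code: the class name CategorySplitter and its split method are hypothetical helpers. It assumes the category member of the POJO is changed from String to List<String>; as far as I know, SolrJ's @Field annotation can bind a List or array member to a multi-valued field, but verify this against the SolrJ version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns "2,6,7,8" into separate values so that each
// category ID can be indexed as its own value of a multi-valued field.
final class CategorySplitter {
    private CategorySplitter() {}

    /** Splits a comma-separated string, trimming whitespace and skipping empty entries. */
    static List<String> split(String commaSeparated) {
        List<String> values = new ArrayList<>();
        if (commaSeparated == null) {
            return values; // nothing to index
        }
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Assumption: the POJO setter would then take a List<String>, e.g.
        //   @Field("category")
        //   public void setCategory(List<String> category) { ... }
        System.out.println(split("2,6,7,8")); // prints [2, 6, 7, 8]
    }
}
```

With the values indexed separately, a plain query such as q=category:7 should match only documents that contain the category ID 7, assuming the field is a non-tokenized multi-valued string field.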

OR

Change the query to something like

q=category:*7*
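A caveat with the wildcard approach: when the whole string "2,6,7,8" is indexed as one value, category:*7* behaves like a substring test, so it would also match documents whose categories include 17 or 70. The sketch below only simulates that difference in plain Java and does not call Solr; the class and method names are hypothetical. Leading wildcards also tend to be slow on large indexes.

```java
import java.util.Arrays;

// Hypothetical simulation of the two matching strategies; no Solr involved.
final class WildcardPitfall {
    private WildcardPitfall() {}

    /** Mimics category:*7* against a single un-split string: a substring test. */
    static boolean substringMatch(String stored, String id) {
        return stored.contains(id);
    }

    /** Mimics an exact match against values that were indexed separately. */
    static boolean valueMatch(String stored, String id) {
        return Arrays.asList(stored.split(",")).contains(id);
    }

    public static void main(String[] args) {
        String stored = "2,17,8"; // does NOT contain category 7
        System.out.println(substringMatch(stored, "7")); // true  (false positive via "17")
        System.out.println(valueMatch(stored, "7"));     // false (correct)
    }
}
```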