I want to use Hibernate Search to run full-text searches on email addresses stored in an entity.
Given the following entity "Person" with an indexed field "email":
package com.example

import javax.persistence.Entity
import javax.persistence.GeneratedValue
import javax.persistence.GenerationType
import javax.persistence.Id
import org.hibernate.search.annotations.Field
import org.hibernate.search.annotations.Indexed

@Entity
@Indexed
class Person {
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    Long id

    @Field
    String email
}
and given the following repository:
package com.example

import javax.persistence.EntityManager
import org.apache.lucene.search.Query
import org.hibernate.search.jpa.FullTextEntityManager
import org.hibernate.search.jpa.Search
import org.hibernate.search.query.dsl.QueryBuilder
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.stereotype.Repository

@Repository
class SearchRepository {
    @Autowired
    EntityManager entityManager

    FullTextEntityManager getFullTextEntityManager() {
        Search.getFullTextEntityManager(entityManager)
    }

    List<Person> findPeople(String searchText) {
        searchText = searchText.toLowerCase() + '*'
        QueryBuilder qb = fullTextEntityManager.searchFactory
            .buildQueryBuilder().forEntity(Person).get()
        Query query =
            qb
                .keyword()
                .wildcard()
                .onField('email')
                .matching(searchText)
                .createQuery()
        javax.persistence.Query jpaQuery =
            fullTextEntityManager.createFullTextQuery(query, Person)
        jpaQuery.resultList
    }
}
the following test fails:
package com.example

import javax.persistence.EntityManager
import org.hibernate.search.jpa.FullTextEntityManager
import org.hibernate.search.jpa.Search
import org.junit.Test
import org.junit.runner.RunWith
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.boot.test.SpringApplicationConfiguration
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner
import org.springframework.transaction.annotation.Transactional

@RunWith(SpringJUnit4ClassRunner)
@SpringApplicationConfiguration(classes = HibernateSearchWildcardApplication)
@Transactional
class SearchWildcardTest {
    @Autowired
    SearchRepository searchRepo

    @Autowired
    PersonRepository personRepo

    @Autowired
    EntityManager em

    FullTextEntityManager getFullTextEntityManager() {
        Search.getFullTextEntityManager(em)
    }

    @Test
    void findTeamsByNameWithWildcard() {
        Person person = personRepo.save new Person(email: 'foo@bar.com')
        fullTextEntityManager.createIndexer().startAndWait()
        fullTextEntityManager.flushToIndexes()
        List<Person> people = searchRepo.findPeople('foo@bar.com')
        assert people.contains(person) // this assertion fails! Why?
    }
}

package com.example

import org.springframework.data.repository.CrudRepository

interface PersonRepository extends CrudRepository<Person, Long> {
}

buildscript {
    ext {
        springBootVersion = '1.2.7.RELEASE'
    }
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
        classpath('io.spring.gradle:dependency-management-plugin:0.5.2.RELEASE')
    }
}

apply plugin: 'groovy'
apply plugin: 'eclipse'
apply plugin: 'spring-boot'
apply plugin: 'io.spring.dependency-management'

jar {
    baseName = 'hibernate-search-email'
    version = '0.0.1-SNAPSHOT'
}

sourceCompatibility = 1.8
targetCompatibility = 1.8

repositories {
    mavenCentral()
}

dependencies {
    compile('org.springframework.boot:spring-boot-starter-data-jpa')
    compile('org.codehaus.groovy:groovy')
    compile('org.hibernate:hibernate-search:5.3.0.Final')
    testCompile('com.h2database:h2')
    testCompile('org.springframework.boot:spring-boot-starter-test')
}

task wrapper(type: Wrapper) {
    gradleVersion = '2.8'
}

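Here is what Luke shows for the generated Lucene index after running the test:
It looks to me as if the email address "foo@bar.com" is not stored in the index as a whole, but is split into the two strings "foo" and "bar.com".
The "Getting started" guide on the official Hibernate Search website states:
[...] The standard tokenizer splits words at punctuation characters and hyphens while keeping email addresses and internet hostnames intact. It is a good general purpose tokenizer. [...]
I must be missing something here, but I cannot figure out what.
My question: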
Answer 0 (score: 4)
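It seems the documentation does not correctly reflect changes in the underlying Lucene API.

[K]eeping email addresses and internet hostnames intact...

This used to be true for the old StandardTokenizer, but that has since changed on the Lucene side. The old behavior can now be found in ClassicTokenizer.

So the following configuration should give you what you need:

import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.standard.ClassicTokenizerFactory;
import org.hibernate.search.annotations.Analyzer;
import org.hibernate.search.annotations.AnalyzerDef;
import org.hibernate.search.annotations.TokenFilterDef;
import org.hibernate.search.annotations.TokenizerDef;

@Entity
@Indexed
@AnalyzerDef(
    name = "emailanalyzer",
    // ClassicTokenizer keeps an email address as a single token
    tokenizer = @TokenizerDef(factory = ClassicTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    }
)
class Person {
    // ...
    @Field
    @Analyzer(definition = "emailanalyzer")
    String email;
}

Note that this configuration also applies lower-casing. We will adjust the HSEARCH documentation accordingly, thanks for spotting this!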
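
To make the failure mode concrete, here is a minimal plain-Java sketch of why a trailing-wildcard query cannot match once the address has been split. It does not use Lucene or Hibernate Search at all: the class `EmailTokenSketch` and its methods are hypothetical, and the two "tokenizers" are deliberately crude stand-ins (one breaks at `@`, as the index in the question shows, the other keeps the address whole). The only assumption taken from the real libraries is that a wildcard pattern is compared against individual indexed tokens without being analyzed itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EmailTokenSketch {

    // Hypothetical stand-in for a tokenizer that breaks at '@' and whitespace,
    // mimicking the "foo" / "bar.com" split seen in the index.
    static List<String> splitAtSign(String text) {
        return tokenize(text, "[@\\s]+");
    }

    // Hypothetical stand-in for a tokenizer that only breaks at whitespace,
    // so an email address survives as a single token.
    static List<String> keepEmailIntact(String text) {
        return tokenize(text, "\\s+");
    }

    // Lower-cases the text and splits it on the given delimiter regex.
    private static List<String> tokenize(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split(delimiterRegex)) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Compares the pattern against each token as-is: a trailing '*' means
    // prefix match, otherwise the token must equal the pattern exactly.
    static boolean matches(List<String> tokens, String pattern) {
        boolean wildcard = pattern.endsWith("*");
        String term = wildcard ? pattern.substring(0, pattern.length() - 1) : pattern;
        for (String token : tokens) {
            if (wildcard ? token.startsWith(term) : token.equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String pattern = "foo@bar.com*";
        List<String> split = splitAtSign("foo@bar.com");
        List<String> intact = keepEmailIntact("foo@bar.com");
        System.out.println(split + " matches: " + matches(split, pattern));
        System.out.println(intact + " matches: " + matches(intact, pattern));
    }
}
```

With the split tokens, neither "foo" nor "bar.com" starts with "foo@bar.com", so the query finds nothing; with the address kept as one token, the prefix matches. This is the same effect the ClassicTokenizer-based analyzer is meant to achieve in the real index.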