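I'm trying to improve search efficiency in my application. The use case: given a set of IDs, run the search only on the rows with those IDs. The problem is that the set contains more than 1024 IDs. When I call BooleanQuery.setMaxClauseCount(ids.size()), I no longer get the TooManyClauses exception, but the query times out and returns no results.
Ideally, I would like to do this: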
public ResponseEntity search(
@RequestParam(Param.SEARCH) String query,
@RequestParam(value = Param.PAGE, defaultValue = Pagination.DEFAULT_PAGE) int page,
@RequestParam(value = Param.SIZE, defaultValue = Pagination.DEFAULT_SIZE) int size)
throws Exception {
log.info("Started searching for query : " + query);
final Set<Long> accessibleIds = projectsServiceUtils
.filterIdsAccessibleByLoggedInPerson(null);
final FullTextQuery fullTextQuery = searchHelper
.getPrefixSearchQuery(Project.class, SearchFields.PROJECT_BOOST_MAP, query, accessibleIds);
final List<Project> projects = fullTextQuery.setFirstResult(page * size)
.setMaxResults(size).getResultList();
return ResponseEntity.ok()
.header(Pagination.PAGINATION_PAGE, String.valueOf(page))
.header(Pagination.PAGINATION_SIZE, String.valueOf(size))
.header(Pagination.PAGINATION_COUNT, Long.toString(fullTextQuery.getResultSize()))
.body(projectSearchObjectWriter.writeValueAsString(projects));
}
with this as my getPrefixSearchQuery method:
public <T> FullTextQuery getPrefixSearchQuery(
Class<T> typeClass, Map<String, Float> boostMap, String searchTerms, Set<Long> ids) {
FullTextEntityManager fullTextEntityManager = Search
.getFullTextEntityManager(entityManager);
QueryBuilder qb = fullTextEntityManager
.getSearchFactory()
.buildQueryBuilder()
.forEntity(typeClass).get();
BooleanQuery.Builder luceneQuery = new BooleanQuery.Builder();
String[] tokens = searchTerms.split("\\s+");
for (String token : tokens) {
if (!StopAnalyzer.ENGLISH_STOP_WORDS_SET.contains(token) || tokens.length == 1) {
// Lowercase the token and append '*' so that it is matched as a prefix
final String matcher = token.toLowerCase() + "*";
final WildcardContext wildcardContext = qb.keyword().wildcard();
TermMatchingContext termMatchingContext = null;
for (String field : boostMap.keySet()) {
if (termMatchingContext != null) {
termMatchingContext = termMatchingContext.andField(field).boostedTo(boostMap.get(field));
} else {
termMatchingContext = wildcardContext.onField(field).boostedTo(boostMap.get(field));
}
}
final Query subQuery = termMatchingContext.matching(matcher).createQuery();
luceneQuery.add(subQuery, BooleanClause.Occur.MUST);
}
}
// NEW CODE TO SUPPORT FILTERING
if (ids != null) {
BooleanQuery.setMaxClauseCount(ids.size() + (tokens.length*boostMap.size()));
TermMatchingContext termMatchingContext2 = qb.keyword().wildcard().onField("id");
for (Long id : ids) {
luceneQuery.add(termMatchingContext2.matching(id).createQuery(), BooleanClause.Occur.FILTER);
}
}
FullTextQuery jpaQuery = fullTextEntityManager
.createFullTextQuery(luceneQuery.build(), typeClass);
return jpaQuery;
}
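Because the configuration above returns no results, I have to filter the results after running the query instead. This causes more problems: to make sure the page contains exactly the requested number of results, I can't just process the first page of results. I have to go through the entire result set and then paginate it to get a page of the requested size. Here is my current, very inefficient workaround: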
public ResponseEntity search(
@RequestParam(Param.SEARCH) String query,
@RequestParam(value = Param.PAGE, defaultValue = Pagination.DEFAULT_PAGE) int page,
@RequestParam(value = Param.SIZE, defaultValue = Pagination.DEFAULT_SIZE) int size,
@RequestParam(value = Param.SORT, defaultValue = SortDefault.BY_NAME) String sort)
throws IOException, IdmsException {
log.info("Started searching for query : " + query);
final FullTextQuery fullTextQuery = searchHelper
.getPrefixSearchQuery(Project.class, SearchFields.PROJECT, query);
final List<Project> projects = fullTextQuery.getResultList();
final ImmutableSet<Long> Ids = projects
.stream()
.map(Project::getId)
.collect(collectingAndThen(Collectors.toSet(),
ImmutableSet::copyOf));
final Set<Long> accessibleIds = projectsServiceUtils
.filterIdsAccessibleByLoggedInPerson(Ids);
PageRequest pageRequest = new PageRequest(
page, size);
final Page<Project> projectsFiltered = projectRepository.findByIdIn(
accessibleIds, pageRequest);
return ResponseEntity.ok()
.header(Pagination.PAGINATION_PAGE, String.valueOf(page))
.header(Pagination.PAGINATION_SIZE, String.valueOf(size))
.header(Pagination.PAGINATION_COUNT, String.valueOf(accessibleIds.size()))
.body(projectSearchObjectWriter.writeValueAsString(projectsFiltered.getContent()));
}
Is there a way to run the search only on the rows with the given IDs and still have the results paginated?
Answer 0 (score: 0)
if (ids != null) {
BooleanQuery.setMaxClauseCount(ids.size() + (tokens.length*boostMap.size()));
TermMatchingContext termMatchingContext2 = qb.keyword().wildcard().onField("id");
for (Long id : ids) {
luceneQuery.add(termMatchingContext2.matching(id).createQuery(), BooleanClause.Occur.FILTER);
}
}
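You are using a wildcard query for the IDs, but you don't need one, because you are matching exact values. Depending on the size of the ID list, this may be the cause of your timeout.
Try this instead: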
if (ids != null) {
BooleanQuery.setMaxClauseCount(ids.size() + (tokens.length*boostMap.size()));
for (Long id : ids) {
luceneQuery.add(qb.keyword().onField("id").matching(id).createQuery(), BooleanClause.Occur.FILTER);
}
}
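Also, you don't need to set the maximum clause count every time you execute a query. In fact, you shouldn't: the setting is static, so it is shared by all queries, and several queries may be executing in parallel.
Just do this once when the application starts: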
BooleanQuery.setMaxClauseCount(<some large number>);
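One caveat about the loop above that adds one Occur.FILTER clause per ID, offered as a note rather than a tested fix: in Lucene, every FILTER clause has to match (it behaves like MUST without scoring), so one FILTER clause per ID asks for a document whose id equals all of the IDs at once. The sketch below is a plain-Java model of that matching logic, not Lucene code; the class OccurModel and its methods are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.Set;

/** Plain-Java model of boolean clause matching; hypothetical, not Lucene code. */
public class OccurModel {

    /** One FILTER clause per ID: all clauses are ANDed, so every ID must equal docId. */
    public static boolean matchesAllFilters(long docId, Set<Long> ids) {
        return ids.stream().allMatch(id -> id == docId);
    }

    /** A nested group of SHOULD clauses: the group matches when any single ID equals docId. */
    public static boolean matchesAnyShould(long docId, Set<Long> ids) {
        return ids.stream().anyMatch(id -> id == docId);
    }

    public static void main(String[] args) {
        List<Long> index = List.of(1L, 2L, 3L, 4L);   // document ids in the "index"
        Set<Long> accessible = Set.of(2L, 3L);        // ids the user may see

        long perIdFilter = index.stream()
                .filter(d -> matchesAllFilters(d, accessible)).count();
        long orGroup = index.stream()
                .filter(d -> matchesAnyShould(d, accessible)).count();

        System.out.println(perIdFilter + " vs " + orGroup); // prints "0 vs 2"
    }
}
```

In the query itself, this would presumably mean collecting the ID clauses in a separate BooleanQuery.Builder with BooleanClause.Occur.SHOULD and then adding that nested query to luceneQuery once with BooleanClause.Occur.FILTER.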