Question

The example on Solr's wiki page shows one way of indexing hierarchy nodes:

Doc#1: 0/NonFic, 1/NonFic/Law
Doc#2: 0/NonFic, 1/NonFic/Sci
Doc#3: 0/NonFic, 1/NonFic/Hist

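How do I index my paths to achieve this? Do I split the path manually, count the nodes, generate these terms myself and store them as an array in Solr (in a multiValued field), or can I configure Solr's path hierarchy tokenizer to do this at index time?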

For reference, this is how I would generate such paths:

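import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collectors;

public class DocumentPathBuilder {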

    private List<String> nodes = new ArrayList<>();

    public static DocumentPathBuilder newInstance() {
        return new DocumentPathBuilder();
    }

    public static String escapeText(String input) {
        if (input == null)
            throw new NullPointerException("Cannot escape null input!");
        // String.replace treats the separator literally; replaceAll would interpret it as a regex.
        return input.replace(ESearchDocumentPath.HIERARCHY_SEPERATOR, "").toUpperCase().trim();
    }

    public DocumentPathBuilder add(String node) {
        nodes.add(escapeText(node));
        return this;
    }

    public DocumentPathBuilder add(Collection<String> nodes) {
        this.nodes.addAll(nodes.stream()
                .map(n->escapeText(n))
                .collect(Collectors.toList())
        );
        return this;
    }

    public List<String> build() {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i++) {
            StringJoiner joiner = new StringJoiner(ESearchDocumentPath.HIERARCHY_SEPERATOR);
            joiner.add(""+i);
            for (int j = 0; j <= i; j++) {
                joiner.add(nodes.get(j));
            }
            result.add(joiner.toString()+ESearchDocumentPath.HIERARCHY_SEPERATOR);
        }
        return result;
    }
}

Example input:

  List<String> build = DocumentPathBuilder.newInstance()
                .add("A")
                .add("350")
                .add(Arrays.asList("350-01", "FIGUTZRg"))
                .build();

Output entries:

0 = "0>A>"
1 = "1>A>350>"
2 = "2>A>350>350-01>"
3 = "3>A>350>350-01>FIGUTZRG>"

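Also, what is the difference? If I store the generated values in a multiValued field, do I get the same result as if Solr had generated them with the path hierarchy tokenizer?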

Answer 1

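From the page you referenced:

You must perform some index-time processing on this flattened data in order to create the tokens needed for the facet.prefix approach. When we index the data, we create specially formatted terms that encode the depth information for each node that appears as part of the path, and include the hierarchy separated by a common separator ("depth/first level term/second level term/etc"). We also add additional terms for every ancestor in the original data.

So this is not something the Path Hierarchy Tokenizer does for you. Its documentation example also shows what the generated tokens look like (note that there is no leading depth value):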

<fieldType name="text_path" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="\" replace="/"/>
  </analyzer>
</fieldType>

In: "c:\usr\local\apache"

Out: "c:", "c:/usr", "c:/usr/local", "c:/usr/local/apache"
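
To make the difference concrete, here is a minimal sketch in plain Java (no Solr dependency) that builds both shapes of term from the same path. The class and method names (HierarchyTerms, depthTerms, plainPrefixes) are made up for illustration, and plainPrefixes only approximates the shape of the tokenizer output shown above; it is not the tokenizer itself. The point is that the facet.prefix approach needs the leading depth number, so those values would have to be generated before indexing (for example with a builder like the one in the question) and stored in a multiValued field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class HierarchyTerms {

    // Depth-prefixed terms as described on the wiki page,
    // e.g. "NonFic/Law" -> ["0/NonFic", "1/NonFic/Law"].
    public static List<String> depthTerms(String path, String sep) {
        List<String> out = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        int depth = 0;
        // Pattern.quote makes the separator a literal, not a regex.
        for (String node : path.split(Pattern.quote(sep))) {
            if (node.isEmpty()) {
                continue; // skip empty segments such as a leading separator
            }
            prefix.append(sep).append(node);
            out.add(depth + prefix.toString());
            depth++;
        }
        return out;
    }

    // Plain ancestor prefixes without any depth number, which is roughly
    // the shape of the path hierarchy tokenizer output (an approximation).
    public static List<String> plainPrefixes(String path, String sep) {
        List<String> out = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (String node : path.split(Pattern.quote(sep))) {
            if (node.isEmpty()) {
                continue;
            }
            if (prefix.length() > 0) {
                prefix.append(sep);
            }
            prefix.append(node);
            out.add(prefix.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(depthTerms("NonFic/Law", "/"));    // [0/NonFic, 1/NonFic/Law]
        System.out.println(plainPrefixes("NonFic/Law", "/")); // [NonFic, NonFic/Law]
    }
}
```

So the two are not equivalent: the self-generated values carry the depth, while the tokenizer-style prefixes do not, and a facet.prefix query such as "1/NonFic/" only works against the former.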
