Question

I'm learning RavenDB, and I'm trying to get the most out of it by using a custom Lucene analyzer.

According to the docs:

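The analyzer you are referencing must be available to the RavenDB server instance. When using analyzers that do not come with the default Lucene.NET distribution, you need to drop all the necessary DLLs into the "Analyzers" folder of the RavenDB server directory, and use their fully qualified type name (including the assembly name).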

It looks simple, maybe even too simple, but I tried it with no luck. I used this code (in a CustomAnalyzers project) to implement an NGramAnalyzer (and its filter) as a custom analyzer:

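[The specific implementation is not important, since for now I'm just trying to get any custom analyzer working before moving on. I'm including it anyway in case it helps.]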

public class NGramTokenFilter : TokenFilter
{
    public static int DEFAULT_MIN_NGRAM_SIZE = 1;
    public static int DEFAULT_MAX_NGRAM_SIZE = 2;
    private int minGram, maxGram;
    private char[] curTermBuffer;
    private int curTermLength;
    private int curGramSize;
    private int curPos;
    private int tokStart;
    private TermAttribute termAtt;
    private OffsetAttribute offsetAtt;
    public NGramTokenFilter(TokenStream input, int minGram, int maxGram)
        : base(input)
    {

        if (minGram < 1)
        {
            throw new System.ArgumentException("minGram must be greater than zero");
        }
        if (minGram > maxGram)
        {
            throw new System.ArgumentException("minGram must not be greater than maxGram");
        }
        this.minGram = minGram;
        this.maxGram = maxGram;

        this.termAtt = AddAttribute<TermAttribute>();
        this.offsetAtt = AddAttribute<OffsetAttribute>();
    }

    public NGramTokenFilter(TokenStream input)
        : this(input, DEFAULT_MIN_NGRAM_SIZE, DEFAULT_MAX_NGRAM_SIZE)
    {
    }

    public override bool IncrementToken()
    {
        while (true)
        {
            if (curTermBuffer == null)
            {
                if (!input.IncrementToken())
                {
                    return false;
                }
                else
                {
                    curTermBuffer = (char[])termAtt.TermBuffer().Clone();
                    curTermLength = termAtt.TermLength();
                    curGramSize = minGram;
                    curPos = 0;
                    tokStart = offsetAtt.StartOffset;
                }
            }
            while (curGramSize <= maxGram)
            {
                while (curPos + curGramSize <= curTermLength)
                { // while there is input
                    ClearAttributes();
                    termAtt.SetTermBuffer(curTermBuffer, curPos, curGramSize);
                    offsetAtt.SetOffset(tokStart + curPos, tokStart + curPos + curGramSize);
                    curPos++;
                    return true;
                }
                curGramSize++; // increase n-gram size
                curPos = 0;
            }
            curTermBuffer = null;
        }
    }
    public override void Reset()
    {
        base.Reset();
        curTermBuffer = null;
    }
}
public class NGramAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        var tokenizer = new StandardTokenizer(Version.LUCENE_29, reader) { MaxTokenLength = 255 };
        TokenStream filter = new StandardFilter(tokenizer);
        filter = new LowerCaseFilter(filter);
        filter = new StopFilter(false, filter, StandardAnalyzer.STOP_WORDS_SET);
        return new NGramTokenFilter(filter, 2, 6);
    }
}
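
For readers who don't want to trace the filter by hand, the order in which it emits grams can be sketched without Lucene at all. This is only an illustration: `NGramSketch.Grams` is a hypothetical helper (not part of Lucene.NET or RavenDB) that mirrors the loop structure of `IncrementToken` above, emitting every gram of the smallest size from left to right before moving on to the next size.

```csharp
using System;
using System.Collections.Generic;

public static class NGramSketch
{
    // Hypothetical helper: reproduces the emission order of the filter's
    // IncrementToken loop for a single term, using plain strings.
    public static List<string> Grams(string term, int minGram, int maxGram)
    {
        if (minGram < 1)
            throw new ArgumentException("minGram must be greater than zero");
        if (minGram > maxGram)
            throw new ArgumentException("minGram must not be greater than maxGram");

        var grams = new List<string>();
        // Outer loop: gram size, as in "while (curGramSize <= maxGram)".
        for (int size = minGram; size <= maxGram; size++)
        {
            // Inner loop: start position, as in "while (curPos + curGramSize <= curTermLength)".
            for (int pos = 0; pos + size <= term.Length; pos++)
            {
                grams.Add(term.Substring(pos, size));
            }
        }
        return grams;
    }

    public static void Main()
    {
        // prints "ra, av, ve, en, rav, ave, ven"
        Console.WriteLine(string.Join(", ", Grams("raven", 2, 3)));
    }
}
```

A term shorter than `minGram` produces no grams at all, which is worth keeping in mind with the `(2, 6)` range used by `NGramAnalyzer`: single-character tokens would not be indexed.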

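I then added the DLLs (all the DLLs from the class library) to the Analyzers directory. (The Analyzers folder did not exist, so I created a new one; I couldn't find any other analyzers folder...)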

In another project (which references the 'CustomAnalyzers' project) I'm trying to build the index:

public class NGramIndex : AbstractIndexCreationTask<Book>
{
    public NGramIndex()
    {
        Map = books => from book in books
                        select new
                        {
                            book.Body
                        };

        Indexes.Add(x => x.Body, FieldIndexing.Analyzed);
        Analyzers.Add(x => x.Body, typeof(NGramAnalyzer).FullName);
    }
}

When I run this code:

var store = new DocumentStore { Url = "MY_URL", DefaultDatabase = "MY_DB" }.Initialize();
new NGramIndex().Execute(store);

I get this exception:

An exception of type 'Raven.Abstractions.Exceptions.IndexCompilationException' occurred in mscorlib.dll but was not handled in user code. Additional information: Cannot find analyzer type 'CustomAnalyzers.NGramAnalyzer, CustomAnalyzers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' for field: Body

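I also tried using 'AssemblyQualifiedName' and hard-coding the full name. I looked at these Stack Overflow questions: 1 2, as well as this and this, and couldn't find an answer. I also tried restarting RavenDB.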
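
Since the docs quoted above ask for the fully qualified type name including the assembly name, it may help to see what each `Type` property actually returns. The sketch below uses only standard .NET reflection. The `NGramAnalyzer` class here is an empty stand-in so the example compiles without Lucene.NET, and `ShortQualifiedName` is a hypothetical helper. Whether RavenDB accepts the shorter "type, assembly" form is an assumption I have not verified, and none of this by itself explains the exception.

```csharp
using System;

namespace CustomAnalyzers
{
    // Empty stand-in for the real analyzer, so this sketch compiles on its own.
    public class NGramAnalyzer { }

    public static class TypeNameDemo
    {
        // Hypothetical helper: "Namespace.Type, AssemblyName" without
        // version, culture, or public key token.
        public static string ShortQualifiedName(Type type)
        {
            return type.FullName + ", " + type.Assembly.GetName().Name;
        }

        public static void Main()
        {
            Type type = typeof(NGramAnalyzer);

            // Namespace and type name only; no assembly information.
            Console.WriteLine(type.FullName); // CustomAnalyzers.NGramAnalyzer

            // Full form: type, assembly name, Version, Culture, PublicKeyToken.
            Console.WriteLine(type.AssemblyQualifiedName);

            // Short form: type plus simple assembly name.
            Console.WriteLine(ShortQualifiedName(type));
        }
    }
}
```

Note that `typeof(NGramAnalyzer).FullName`, as used in `NGramIndex`, carries no assembly name at all, while the exception message above shows the full assembly-qualified form.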

Please explain how you use a custom analyzer in RavenDB. Thanks.

RavenDb - How to use a Custom Analyzer (NGram)

0 answers: