OpenNLP.Net InputStreamFactory: error when trying to load a file

Time: 2019-05-15 08:41:04

Tags: c# opennlp

Hi, I'm new to OpenNLP.Net and I'm a bit lost on the basic steps.

I looked at some Java code and tried to convert it to C#, but since I can't find any C# examples, I suspect I got it wrong.

Right now I am trying to run the following code, which is called from Main:
using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;
using opennlp.tools.doccat;
using opennlp.tools.tokenize;
using opennlp.tools.util;

public class Account
{
    public string Name { get; set; }
    public string Email { get; set; }
    public DateTime DOB { get; set; }
}



namespace Loading_OpenNLP
{
    class Program
    {
        static void Main(string[] args)

        {
            Account account = new Account
            {
                Name = "John Doe",
                Email = "john@microsoft.com",
                DOB = new DateTime(1980, 2, 20, 0, 0, 0, DateTimeKind.Utc),
            };
            string json = JsonConvert.SerializeObject(account, Formatting.Indented);
            Console.WriteLine(json);
            getNLPModel();
            string pause = Console.ReadLine();


        }

        static void getNLPModel() // java.io.File openNLPTraining)
        {
            InputStreamFactory inputStreamFactory = new MarkableFileInputStreamFactory(new java.io.File("D:\\text.txt"));

            ObjectStream lineStream = new PlainTextByLineStream(inputStreamFactory, "UTF-8");
            ObjectStream sampleStream = new DocumentSampleStream(lineStream);
        }
    }
}

It compiles, but it can't find the file... What is wrong?

1 Answer:

Answer 0: (score: 0)

You can implement InputStreamFactory yourself.

Here is an F# example that trains a custom NER model; the InputStreamFactory is implemented with an F# object expression.

#I "../packages/OpenNLP.NET/lib/"
#r "opennlp-tools-1.8.4.dll"
#r "opennlp-uima-1.8.4.dll"

open java.nio.charset
open java.io
open opennlp.tools.util
open opennlp.tools.namefind

let train (inputFile:string) = 
    let factory =
        { new InputStreamFactory with 
            member __.createInputStream () =
                new FileInputStream(inputFile) :> InputStream }
    let lineStream = new PlainTextByLineStream(factory, StandardCharsets.UTF_8)
    use sampleStream = new NameSampleDataStream(lineStream)
    let nameFinderFactory = new TokenNameFinderFactory()

    let trainingParameters = new TrainingParameters();
    //trainingParameters.put(TrainingParameters.ITERATIONS_PARAM, "5");
    //trainingParameters.put(TrainingParameters.CUTOFF_PARAM, "200");

    NameFinderME.train ("en", "person", sampleStream, trainingParameters, nameFinderFactory)

In C#, the same code might look like this:

using java.nio.charset;
using java.io;
using opennlp.tools.util;
using opennlp.tools.namefind;

namespace OpenNLP.Train
{
    class MyStreamFactory: InputStreamFactory
    {
        // The constructor name must match the class name.
        public MyStreamFactory(string fileName) => _filename = fileName;
        private readonly string _filename;

        public InputStream createInputStream()
            => new FileInputStream(_filename);
    }
    class Program
    {
        static void Main(string[] args)
        {
            var factory = new MyStreamFactory("D:\\text.txt");
            var lineStream = new PlainTextByLineStream(factory, StandardCharsets.UTF_8);
            var sampleStream = new NameSampleDataStream(lineStream);
            var nameFinderFactory = new TokenNameFinderFactory();

            var trainingParameters = new TrainingParameters();

            var model = NameFinderME.train("en", "person", sampleStream, trainingParameters, nameFinderFactory);
        }
    }
}
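
Since the original symptom was a file that could not be found, it may also help to check the path on the .NET side before handing it to the factory. The following is only a minimal sketch: `TrainingFileCheck` and `ResolveTrainingFile` are hypothetical names, not part of OpenNLP.NET, and the code uses nothing beyond System.IO.

```csharp
using System;
using System.IO;

static class TrainingFileCheck
{
    // Hypothetical helper (not part of OpenNLP.NET): returns the absolute
    // path of the training file, or throws if the file does not exist.
    public static string ResolveTrainingFile(string path)
    {
        // Expand relative paths against the current working directory,
        // so the error message shows where the program actually looked.
        string fullPath = Path.GetFullPath(path);

        if (!File.Exists(fullPath))
        {
            throw new FileNotFoundException(
                "Training file not found: " + fullPath, fullPath);
        }

        return fullPath;
    }
}
```

With the C# answer above, this could be used as `new MyStreamFactory(TrainingFileCheck.ResolveTrainingFile("D:\\text.txt"))`. If the exception is thrown, the problem lies with the path itself (wrong drive, wrong extension, wrong working directory) rather than with OpenNLP.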