Question

I am trying to write code that uses the Stanford parser, following the example at the link below:

https://gist.github.com/a34729t/2562754

I have included all the jar files, but an error is shown on this import statement:

import edu.stanford.nlp.fsm.ExactGrammarCompactor;

Can anyone tell me how to solve this? I have included all the jar files, but I still can't figure out what the real problem is.

Exception in thread "main" java.lang.Error: Unresolved compilation problem:

at pkg.stanford.Stan.main(Stan.java:39)


import edu.stanford.nlp.fsm.ExactGrammarCompactor;

import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.io.NumberRangeFileFilter;
import edu.stanford.nlp.io.NumberRangesFileFilter;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.objectbank.TokenizerFactory;
import edu.stanford.nlp.parser.ViterbiParser;
import edu.stanford.nlp.parser.KBestViterbiParser;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.util.Function;
import edu.stanford.nlp.process.WhitespaceTokenizer;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.trees.international.arabic.ArabicTreebankLanguagePack;
import edu.stanford.nlp.util.Generics;
import edu.stanford.nlp.util.Numberer;
import edu.stanford.nlp.util.Pair;
import edu.stanford.nlp.util.Timing;
import edu.stanford.nlp.util.ScoredObject;

import java.io.*;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.*;
import java.util.zip.GZIPOutputStream;
import java.util.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.PTBTokenizer;

public class RunStanfordParser {

public static void main(String[] args) throws Exception {
    // input format: data directory, and output directory
    String parserFileOrUrl=args[0];
    String fileToParse=args[1];

    LexicalizedParser lp = new LexicalizedParser(parserFileOrUrl); 
    //lp.setOptionFlags(new String[]{"-maxLength", "80", "-retainTmpSubcategories"}); // set max sentence length if you want

    // Call parser on files, and tokenize the contents
    FileInputStream fstream = new FileInputStream(fileToParse);
    DataInputStream in = new DataInputStream(fstream); 
    BufferedReader br = new BufferedReader(new InputStreamReader(in));
    StringReader sr; 
    PTBTokenizer tkzr; // tokenizer object
    WordStemmer ls = new WordStemmer(); // stemmer/lemmatizer object

    // Read File Line By Line
    String strLine;
    while ((strLine = br.readLine()) != null)   {
        System.out.println ("Tokenizing and Parsing: "+strLine); 


        sr = new StringReader(strLine);
        tkzr = PTBTokenizer.newPTBTokenizer(sr);
        List toks = tkzr.tokenize();
        System.out.println ("tokens: "+toks);

        Tree parse = (Tree) lp.apply(toks); 
        ArrayList<String> words = new ArrayList();
        ArrayList<String> stems = new ArrayList();
        ArrayList<String> tags = new ArrayList();

        // Get words and Tags
        for (TaggedWord tw : parse.taggedYield()){
            words.add(tw.word());
            tags.add(tw.tag());
        }

        // Get stems
        ls.visitTree(parse); // apply the stemmer to the tree
        for (TaggedWord tw : parse.taggedYield()){
            stems.add(tw.word());
        }

        // Get dependency tree
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        Collection tdl = gs.typedDependenciesCollapsed();

        // And print!
        System.out.println("words: "+words); 
        System.out.println("POStags: "+tags); 
        System.out.println("stemmedWordsAndTags: "+stems); 
        System.out.println("typedDependencies: "+tdl); 



        System.out.println(); // separate output lines
        }

    }

}

Answer 1

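You need to pass two input arguments to your program, because it requires them. The lines

String parserFileOrUrl=args[0];
String fileToParse=args[1];

expect args[0] and args[1] to exist. Check the length of the array before the code continues, for example: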

String parserFileOrUrl = null;
String fileToParse = null;
if (args.length == 2) {
    parserFileOrUrl = args[0];
    fileToParse = args[1];
} else {
    System.exit(1);
}

If the two inputs are not supplied, the program exits.

Note: I set the System.exit code to 1 to indicate that an error occurred.
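As a slightly fuller sketch of the same check, the version below also prints a usage message to standard error before exiting, so a missing argument is visible instead of silent. The class name ArgsCheck and the helper validate are hypothetical names chosen for illustration; no Stanford parser classes are involved, and the two argument names are simply the ones used in the question.

```java
class ArgsCheck {

    // Hypothetical helper: returns an error message when the argument count
    // is wrong, or null when exactly two arguments were supplied.
    static String validate(String[] args) {
        if (args.length != 2) {
            return "Usage: java ArgsCheck <parserFileOrUrl> <fileToParse> (got "
                    + args.length + " argument(s))";
        }
        return null;
    }

    public static void main(String[] args) {
        String error = validate(args);
        if (error != null) {
            // Report the problem, then exit with a non-zero status.
            System.err.println(error);
            System.exit(1);
        }

        // Safe to index now: the length was checked above.
        String parserFileOrUrl = args[0];
        String fileToParse = args[1];
        System.out.println("Parser model: " + parserFileOrUrl);
        System.out.println("File to parse: " + fileToParse);
    }
}
```

Keeping the check in a separate method means it can be exercised without terminating the JVM; only main calls System.exit.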

Error in Stanford parser Java code
