The method word_frequencies(List<List<Pair<String, String>>>) is not applicable for the arguments (ArrayList<ArrayList<Pair<String, String>>>)

Asked: 2016-03-22 20:08:37

Tags: java oop arraylist stanford-nlp

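The word_frequencies function takes a parameter of type List<List<Pair<String, String>>>. When I call it with an argument of type ArrayList<ArrayList<Pair<String, String>>>, I get a compile error, and I don't understand why.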

Here is the code.

The word_frequencies function:

public static TObjectIntHashMap<String> word_frequencies(List<List<Pair<String, String>>> article){
    TObjectIntHashMap<String> word_freqs = new TObjectIntHashMap<>();
    for (List<Pair<String, String>> sentence : article){
        for (Pair<String, String> word : sentence){
            word_freqs.adjustOrPutValue(word.first(), 1, 1);
        }
    }
    return word_freqs;
}

The word_frequencies function is part of VectorSpaceModel (imported in the test as Helper.VectorSpaceModel).

This is the test class I am using to try out parts of the code:

package Testing;

import Helper.VectorSpaceModel;
import java.io.*;
import java.util.regex.*;
import java.util.*;
import org.apache.commons.lang3.mutable.MutableInt;
import gnu.trove.map.hash.*;
import gnu.trove.iterator.*;
import edu.stanford.nlp.util.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.ling.CoreAnnotations.*;
import edu.stanford.nlp.ling.*;

public class VSM_tests {

    public static ArrayList<ArrayList<Pair<String, String>>> tokenize(String text){
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        Annotation a = new Annotation(text);
        pipeline.annotate(a);
        ArrayList<ArrayList<Pair<String, String>>> article = new ArrayList<ArrayList<Pair<String, String>>>();
        for (CoreMap sent : a.get(SentencesAnnotation.class)){
            ArrayList<Pair<String, String>> sentence = new ArrayList<Pair<String, String>>();
            for (CoreLabel l : sent.get(TokensAnnotation.class)){
                String word = l.get(TextAnnotation.class);
                String ner_tag = l.get(NamedEntityTagAnnotation.class);
                Pair<String, String> word_ner = new Pair<String, String>(word, ner_tag);
                sentence.add(word_ner);
            }
            article.add(sentence);
        }
        return article;
    }



    public static void main(String[] args){
        File folder = new File("/Users/---/Documents/reuters/reuters/articles");
        Pattern p = Pattern.compile(".+DS_Store$");
        ArrayList<String> filenames = new ArrayList<String>();
        for (File file_entry : folder.listFiles() ){
            Matcher m = p.matcher(file_entry.getAbsolutePath());
            if (!m.matches()){
                filenames.add(file_entry.getAbsolutePath());
            }
        }
        TObjectIntHashMap<String> word_freqs = new TObjectIntHashMap<String>();
        MutableInt articles_processed = new MutableInt(0);
        word_freqs = VectorSpaceModel.get_initial_word_freqs(filenames, articles_processed);
        /*
        TObjectIntIterator<String> iter = word_freqs.iterator();
        String[] words = word_freqs.keySet().toArray(new String[word_freqs.size()]);
        for (int i = 0; i < words.length; i++){
            System.out.println(words[i] + " " + word_freqs.get(words[i]));
        }
        */
        String article_filename = "/Users/--/Documents/reuters/reuters/articles/us-global-technology-bitcoin-idUSKCN0UT2II";
        String article_text = VectorSpaceModel.read_file(article_filename);
        ArrayList<ArrayList<Pair<String, String>>> article = tokenize(article_text);
        ArrayList<String> test_var = new ArrayList<String>();
        // The compile error is reported on this call.
        TObjectIntHashMap<String> article_freqs = VectorSpaceModel.word_frequencies(article);
    }
}

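Declaring a parameter with an interface type and then passing an argument whose type is an implementing class should be perfectly normal. Why doesn't that work here, with nested generic types?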
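
For what it's worth, the behavior looks consistent with Java generics being invariant: ArrayList<X> is a subtype of List<X>, but ArrayList<ArrayList<X>> is not a subtype of List<List<X>>, because a method holding a List<List<X>> would be allowed to add, say, a LinkedList<X> to it. Below is a minimal sketch of that rule and two possible ways around it. It is not the real project code: Map.Entry and HashMap stand in for Stanford's Pair and Trove's TObjectIntHashMap so the example compiles on its own, and the class and method names (NestedGenericsSketch, strictFrequencies, wildcardFrequencies) are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NestedGenericsSketch {

    // Same shape as the original signature: accepts only List<List<...>>,
    // so an ArrayList<ArrayList<...>> argument is rejected by the compiler.
    public static Map<String, Integer> strictFrequencies(
            List<List<Map.Entry<String, String>>> article) {
        return count(article);
    }

    // Wildcard variant: accepts a List whose elements are any subtype of
    // List<...>, which includes ArrayList<ArrayList<...>>.
    public static Map<String, Integer> wildcardFrequencies(
            List<? extends List<Map.Entry<String, String>>> article) {
        return count(article);
    }

    // Shared counting loop; Map.Entry's key plays the role of Pair.first().
    private static Map<String, Integer> count(
            List<? extends List<Map.Entry<String, String>>> article) {
        Map<String, Integer> freqs = new HashMap<>();
        for (List<Map.Entry<String, String>> sentence : article) {
            for (Map.Entry<String, String> word : sentence) {
                freqs.merge(word.getKey(), 1, Integer::sum);
            }
        }
        return freqs;
    }

    public static void main(String[] args) {
        ArrayList<Map.Entry<String, String>> sentence = new ArrayList<>();
        sentence.add(new SimpleEntry<>("Bitcoin", "O"));
        sentence.add(new SimpleEntry<>("Bitcoin", "O"));
        sentence.add(new SimpleEntry<>("Reuters", "ORGANIZATION"));

        ArrayList<ArrayList<Map.Entry<String, String>>> nested = new ArrayList<>();
        nested.add(sentence);

        // strictFrequencies(nested);  // does not compile: generics are invariant

        // Option 1: relax the parameter type with a wildcard.
        System.out.println(wildcardFrequencies(nested));

        // Option 2: keep the strict signature and declare the outer and inner
        // lists with interface types. Adding an ArrayList element is still fine.
        List<List<Map.Entry<String, String>>> article = new ArrayList<>();
        article.add(sentence);
        System.out.println(strictFrequencies(article));
    }
}
```

Applied to the code in the question, option 1 would presumably mean changing the parameter of word_frequencies to List<? extends List<Pair<String, String>>>, and option 2 would mean having tokenize build and return a List<List<Pair<String, String>>>. Either should let the call compile, assuming nothing else depends on the concrete ArrayList type.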

0 answers:

No answers yet