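I am getting a compile error when I call this word_frequencies function, whose parameter has type List<List<Pair<String, String>>>, with an argument of type ArrayList<ArrayList<Pair<String, String>>>, and I don't understand why.
Here is the code:
The word_frequencies function: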
public static TObjectIntHashMap<String> word_frequencies(List<List<Pair<String, String>>> article){
    // Diamond operator instead of the raw type, to avoid an unchecked warning.
    TObjectIntHashMap<String> word_freqs = new TObjectIntHashMap<>();
    for (List<Pair<String, String>> sentence : article){
        for (Pair<String, String> word : sentence){
            // Add 1 to the count for this word, or insert it with a count of 1.
            word_freqs.adjustOrPutValue(word.first(), 1, 1);
        }
    }
    return word_freqs;
}
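The word_frequencies function is part of the VectorSpaceModel class (imported below from the Helper package).
Here is the test in which I am trying to exercise part of the code: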
package Testing;
import Helper.VectorSpaceModel;
import java.io.*;
import java.util.regex.*;
import java.util.*;
import org.apache.commons.lang3.mutable.MutableInt;
import gnu.trove.map.hash.*;
import gnu.trove.iterator.*;
import edu.stanford.nlp.util.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.ling.CoreAnnotations.*;
import edu.stanford.nlp.ling.*;
public class VSM_tests {
    public static ArrayList<ArrayList<Pair<String, String>>> tokenize(String text){
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        Annotation a = new Annotation(text);
        pipeline.annotate(a);
        ArrayList<ArrayList<Pair<String, String>>> article = new ArrayList<ArrayList<Pair<String, String>>>();
        for (CoreMap sent : a.get(SentencesAnnotation.class)){
            ArrayList<Pair<String, String>> sentence = new ArrayList<Pair<String, String>>();
            for (CoreLabel l : sent.get(TokensAnnotation.class)){
                String word = l.get(TextAnnotation.class);
                String ner_tag = l.get(NamedEntityTagAnnotation.class);
                Pair<String, String> word_ner = new Pair<String, String>(word, ner_tag);
                sentence.add(word_ner);
            }
            article.add(sentence);
        }
        return article;
    }
    public static void main(String[] args){
        File folder = new File("/Users/---/Documents/reuters/reuters/articles");
        Pattern p = Pattern.compile(".+DS_Store$");
        ArrayList<String> filenames = new ArrayList<String>();
        for (File file_entry : folder.listFiles() ){
            Matcher m = p.matcher(file_entry.getAbsolutePath());
            if (!m.matches()){
                filenames.add(file_entry.getAbsolutePath());
            }
        }
        TObjectIntHashMap<String> word_freqs = new TObjectIntHashMap<String>();
        MutableInt articles_processed = new MutableInt(0);
        word_freqs = VectorSpaceModel.get_initial_word_freqs(filenames, articles_processed);
        /*
        TObjectIntIterator<String> iter = word_freqs.iterator();
        String[] words = word_freqs.keySet().toArray(new String[word_freqs.size()]);
        for (int i = 0; i < words.length; i++){
            System.out.println(words[i] + " " + word_freqs.get(words[i]));
        }
        */
        String article_filename = "/Users/--/Documents/reuters/reuters/articles/us-global-technology-bitcoin-idUSKCN0UT2II";
        String article_text = VectorSpaceModel.read_file(article_filename);
        ArrayList<ArrayList<Pair<String, String>>> article = tokenize(article_text);
        ArrayList<String> test_var = new ArrayList<String>();
        TObjectIntHashMap article_freqs = VectorSpaceModel.word_frequencies(article);
    }
}
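Declaring a parameter with an interface type and then passing an argument whose type is an implementing class should be perfectly normal, so why doesn't that work here with nested generic types?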
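To isolate the problem from Trove and CoreNLP, here is a minimal sketch that uses String in place of Pair<String, String> and a plain HashMap in place of TObjectIntHashMap. The names NestedGenericsDemo, countStrict and countFlexible are hypothetical and do not appear in the code above. Java generics are invariant: ArrayList<ArrayList<String>> is a subtype of List<ArrayList<String>>, but not of List<List<String>>, so only the outer type may vary unless the parameter uses a wildcard.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NestedGenericsDemo {
    // Same shape as word_frequencies: the element type must be exactly List<String>.
    public static Map<String, Integer> countStrict(List<List<String>> article) {
        return countFlexible(article);
    }

    // Hypothetical variant: the bounded wildcard accepts any element type
    // that is a List<String>, including ArrayList<String>.
    public static Map<String, Integer> countFlexible(List<? extends List<String>> article) {
        Map<String, Integer> freqs = new HashMap<>();
        for (List<String> sentence : article) {
            for (String word : sentence) {
                // Add 1 to the existing count, or start at 1.
                freqs.merge(word, 1, Integer::sum);
            }
        }
        return freqs;
    }

    public static void main(String[] args) {
        ArrayList<ArrayList<String>> nested = new ArrayList<>();
        ArrayList<String> sentence = new ArrayList<>();
        sentence.add("bitcoin");
        sentence.add("price");
        sentence.add("bitcoin");
        nested.add(sentence);

        // countStrict(nested);
        // The line above does not compile: ArrayList<ArrayList<String>>
        // cannot be converted to List<List<String>>.

        // Compiles, because ArrayList<String> satisfies "? extends List<String>".
        System.out.println(countFlexible(nested));

        // Also compiles: the outer type may vary as long as the type argument matches exactly.
        List<List<String>> exact = new ArrayList<>();
        exact.add(sentence);
        System.out.println(countStrict(exact));
    }
}
```

Under these assumptions, two changes would each make the original call compile: declaring the parameter of word_frequencies as List<? extends List<Pair<String, String>>>, or having tokenize build and return a List<List<Pair<String, String>>> whose inner lists are still created as ArrayList instances.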