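I am trying to generate all possible sentence combinations. As variables I have two strings: one is the subject, e.g. health, and the other is the object, e.g. fruit. I also have a List<String> of values associated with a "head" word; for the two components above, that list would be [improve, change, alter, modify]. I want to generate every possible combination of these into sentences and add each sentence to a List<Sentence>, for example: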
Sentence example_sentence = new Sentence(verb, object, subject);
sentences.add(example_sentence);
The larger function in which this happens currently looks like this:
public Sentence dataPreprocessing(String raw_subject, String raw_object, String raw_verb, List<Sentence> sentences) throws IOException {
WordNet wordnet = new WordNet();
String verb = wordnet.getStem(raw_verb);
String object = wordnet.getStem(raw_object);
String subject = wordnet.getStem(raw_subject);
List<String> verb_hypernym_container = new ArrayList<>();
verb_hypernym_container = wordnet.getHypernyms(verb, POS.VERB);
//wordnet.getHypernyms(this.object, POS.NOUN);
//wordnet.getHypernyms(this.subject, POS.NOUN);
Sentence return_sentence = new Sentence( verb, object, subject );
return return_sentence;
}
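What is the most efficient way to generate all possible sentences?
Answer 0 (score: 2)
Since you have a fixed number of lists, the simplest approach is to use nested loops: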
List<Sentence> sentences = new ArrayList<>();
for(String verb_hypernym : wordnet.getHypernyms(verb, POS.VERB))
for(String object_hypernym : wordnet.getHypernyms(object, POS.NOUN))
for(String subject_hypernym : wordnet.getHypernyms(subject, POS.NOUN))
sentences.add(new Sentence(verb_hypernym, object_hypernym, subject_hypernym));
return sentences;
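Alternatively, to avoid calling getHypernyms again on every iteration of the outer loops, fetch each list once up front:
List<Sentence> sentences = new ArrayList<>();
List<String> verb_hypernyms = wordnet.getHypernyms(verb, POS.VERB);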
List<String> object_hypernyms = wordnet.getHypernyms(object, POS.NOUN);
List<String> subject_hypernyms = wordnet.getHypernyms(subject, POS.NOUN);
for(String verb_hypernym : verb_hypernyms)
for(String object_hypernym : object_hypernyms)
for(String subject_hypernym : subject_hypernyms)
sentences.add(new Sentence(verb_hypernym, object_hypernym, subject_hypernym));
return sentences;
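To try the nested-loop approach without a WordNet dependency, here is a minimal, self-contained sketch. The SentenceCombinations class, the small Sentence class, and the hard-coded word lists are assumptions made for illustration only; in the real code the three lists would come from wordnet.getHypernyms(...), and Sentence would be the asker's own class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SentenceCombinations {

    // Hypothetical stand-in for the asker's Sentence class.
    public static final class Sentence {
        public final String verb;
        public final String object;
        public final String subject;

        public Sentence(String verb, String object, String subject) {
            this.verb = verb;
            this.object = object;
            this.subject = subject;
        }

        @Override
        public String toString() {
            return "Sentence(verb=" + verb + ", object=" + object + ", subject=" + subject + ")";
        }
    }

    // Builds one Sentence for every (verb, object, subject) triple.
    public static List<Sentence> combine(List<String> verbs,
                                         List<String> objects,
                                         List<String> subjects) {
        List<Sentence> sentences = new ArrayList<>();
        for (String verb : verbs) {
            for (String object : objects) {
                for (String subject : subjects) {
                    sentences.add(new Sentence(verb, object, subject));
                }
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Hard-coded lists stand in for the WordNet lookups.
        List<String> verbs = Arrays.asList("improve", "change", "alter", "modify");
        List<String> objects = Arrays.asList("fruit");
        List<String> subjects = Arrays.asList("health");

        for (Sentence s : combine(verbs, objects, subjects)) {
            System.out.println(s);
        }
    }
}
```

The result size is the product of the three list sizes, so the number of sentences can grow quickly if each hypernym list is long.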
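Answer 1 (score: 1)
Once you have the lists of nouns and verbs, you can use streams to return a list of sentences. This also gives you the chance to remove duplicates, sort, or do anything else you need with the stream.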
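// Argument order follows the question's constructor: Sentence(verb, object, subject).
// distinct() only removes duplicates if Sentence overrides equals() and hashCode().
List<Sentence> sentences = verbList.stream()
.flatMap(verb -> objectList.stream()
.flatMap(object -> subjectList.stream()
.map(subject -> new Sentence(verb, object, subject))))
.distinct()
.collect(Collectors.toList());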
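One detail worth spelling out: distinct() compares elements with equals(), so it only deduplicates if Sentence defines value equality. Below is a minimal sketch of the stream version with that in place. The StreamCombinations class and this Sentence class are hypothetical stand-ins, not the asker's actual code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class StreamCombinations {

    // Hypothetical stand-in for the asker's Sentence class, with value equality.
    public static final class Sentence {
        public final String verb;
        public final String object;
        public final String subject;

        public Sentence(String verb, String object, String subject) {
            this.verb = verb;
            this.object = object;
            this.subject = subject;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Sentence)) return false;
            Sentence other = (Sentence) o;
            return verb.equals(other.verb)
                    && object.equals(other.object)
                    && subject.equals(other.subject);
        }

        @Override
        public int hashCode() {
            return Objects.hash(verb, object, subject);
        }
    }

    // Cartesian product of the three lists, with duplicate sentences removed.
    public static List<Sentence> combine(List<String> verbs,
                                         List<String> objects,
                                         List<String> subjects) {
        return verbs.stream()
                .flatMap(verb -> objects.stream()
                        .flatMap(object -> subjects.stream()
                                .map(subject -> new Sentence(verb, object, subject))))
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "change" appears twice on purpose to show the effect of distinct().
        List<String> verbs = Arrays.asList("change", "change", "alter");
        List<Sentence> result = combine(verbs, Arrays.asList("fruit"), Arrays.asList("health"));
        System.out.println(result.size());
    }
}
```

Without the equals() and hashCode() overrides, every new Sentence would be considered unique and distinct() would have no visible effect.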