I am using Stanford CoreNLP OpenIE from Python. My question is how to produce the most accurate triples without duplicate results.
Here is my code, which I adapted from this:
from pycorenlp import StanfordCoreNLP

# Assumes a CoreNLP server is already running locally on port 9000.
nlp = StanfordCoreNLP("http://localhost:9000/")
s = "On the other occasions I had binding issues it was clearly due to mis-aligned threaded and/or guide rods. Sorry again for the wordiness but I want to be clear as to the circumstances: I've had my TAZ 5 for several months. Printing has been 99% flawless. "
output = nlp.annotate(s, properties={
    # Annotators listed in dependency order, without the stray double comma.
    "annotators": "tokenize,ssplit,pos,lemma,depparse,natlog,openie",
    "outputFormat": "json",
    "openie.triple.strict": "false",
})
# Print every (subject, relation, object) triple, sentence by sentence.
for sentence in output["sentences"]:
    for rel in sentence["openie"]:
        relationSent = rel["subject"], rel["relation"], rel["object"]
        print(relationSent)
The results are as follows:
('it', 'was due to', 'rods')
('it', 'was clearly due to', 'threaded rods')
('it', 'was', 'clearly due')
('it', 'was due to', 'threaded rods')
('it', 'was clearly due to', 'mis-aligned rods')
('it', 'was', 'due')
('it', 'was clearly due to', 'mis-aligned threaded rods')
('I', 'had', 'binding issues')
('it', 'was due to', 'mis-aligned rods')
('I', 'had', 'issues')
('it', 'was due to', 'mis-aligned threaded rods')
('I', 'had issues On', 'occasions')
('I', 'had issues On', 'other occasions')
('it', 'was clearly due to', 'rods')
('I', "'ve had", 'my TAZ 5')
('Printing', 'has', 'has flawless')
('Printing', 'has flawless', '99 %')
As you can see, many of the triples are near-duplicates of one another.
If I set 'openie.triple.strict': 'True' and 'openie.max_entailments_per_clause': '1', the results are as follows:
('I', "'ve had", 'my TAZ 5')
('Printing', 'has', 'has flawless')
('Printing', 'has', 'has 99 % flawless')
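Clearly some relations are now missing, yet the output still contains duplicates.
I am not sure whether it is possible to make the machine understand that, in the sentence "On the other occasions I had binding issues it was clearly due to mis-aligned threaded and/or guide rods.", the word "it" refers to "binding issues", so that the extracted result becomes ('binding issues', 'clearly due to', 'mis-aligned threaded/guide rods').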
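For reference, the CoreNLP OpenIE documentation describes an `openie.resolve_coref` option that replaces pronominal mentions with their canonical mention. Below is a minimal sketch of a properties dict that turns it on. The helper name `build_openie_properties` is hypothetical, the exact annotator list required (in particular `ner` and `coref`) is an assumption that may vary with the CoreNLP version, and I have not verified that it resolves "it" to "binding issues" in this particular sentence.

```python
def build_openie_properties(resolve_coref=True):
    """Build the properties dict passed to nlp.annotate (hypothetical helper)."""
    annotators = ["tokenize", "ssplit", "pos", "lemma"]
    if resolve_coref:
        # Assumption: coreference needs named-entity tags first.
        annotators.append("ner")
    annotators += ["depparse", "natlog"]
    if resolve_coref:
        annotators.append("coref")
    annotators.append("openie")
    return {
        "annotators": ",".join(annotators),
        "outputFormat": "json",
        "openie.resolve_coref": "true" if resolve_coref else "false",
    }


props = build_openie_properties()
print(props["annotators"])
# With a server running, this would be used as:
# output = nlp.annotate(s, properties=props)
```

Coreference makes the server slower and more memory-hungry, so it may need a longer timeout than the default.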
If that is not possible, how can I generate the most complete relations without any duplicates?
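One possible direction is to post-process the triples rather than rely on OpenIE settings alone. The sketch below is not a CoreNLP feature: it is a plain-Python filter that drops any triple whose subject, relation, and object are each covered, word for word, by a longer triple. The function names `is_subsumed` and `keep_maximal` are made up for illustration, and comparing whitespace-separated, lowercased tokens is a simplifying assumption.

```python
def _tokens(text):
    """Lowercased whitespace tokens of one triple component, as a set."""
    return set(text.lower().split())


def is_subsumed(a, b):
    """True if every component of triple `a` uses only words that the
    matching component of triple `b` also has, and the two differ."""
    parts_a = [_tokens(p) for p in a]
    parts_b = [_tokens(p) for p in b]
    return parts_a != parts_b and all(x <= y for x, y in zip(parts_a, parts_b))


def keep_maximal(triples):
    """Drop exact duplicates, then drop triples subsumed by another one."""
    unique = list(dict.fromkeys(triples))  # preserves first-seen order
    return [t for t in unique
            if not any(is_subsumed(t, other) for other in unique)]


# A few triples taken from the output above.
sample = [
    ("it", "was due to", "rods"),
    ("it", "was clearly due to", "threaded rods"),
    ("it", "was clearly due to", "mis-aligned threaded rods"),
    ("I", "had", "binding issues"),
    ("I", "had", "issues"),
    ("I", "had issues On", "occasions"),
    ("I", "had issues On", "other occasions"),
]
for triple in keep_maximal(sample):
    print(triple)
```

This should be applied to one sentence's triples at a time, so that triples from different sentences never suppress each other. Because it compares word sets per component, it keeps triples that say something different, such as ('I', 'had', 'binding issues') alongside ('I', 'had issues On', 'other occasions').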
I know this is a long question, and I am sorry for my wordiness. Thank you very much.