I am using Stanford CoreNLP through the stanfordcorenlp Python wrapper to parse sentences, and I want to extract the dependency parse. However, when I inspect the returned results, I find that the parser ignores the underscore "_" and splits the entity word apart.
For example, the entity "lauren_conrad" is split into three tokens: lauren, _, conrad.
Here is a sample of the text and code:
from stanfordcorenlp import StanfordCoreNLP

# point the wrapper at a local CoreNLP distribution
nlp = StanfordCoreNLP(r'C:\Users\Liao\Desktop\stanford-corenlp-full-2018-10-05')

sentence = "last november , lauren_conrad , 19 , attended a celebrity-filled function thrown by teen people magazine , a perk of being a central character on mtv 's popular reality series '' laguna_beach : the real orange county ."

depend_path = nlp.dependency_parse(sentence)
print(depend_path)
nlp.close()
The output is as follows:
root(ROOT-0, attended-10)
amod(november-2, last-1)
nmod:tmod(attended-10, november-2)
punct(attended-10, ,-3)
compound(conrad-6, lauren-4)
compound(conrad-6, _-5)
nsubj(attended-10, conrad-6)
punct(conrad-6, ,-7)
amod(conrad-6, 19-8)
punct(conrad-6, ,-9)
det(function-13, a-11)
amod(function-13, celebrity-filled-12)
dobj(attended-10, function-13)
acl(function-13, thrown-14)
case(magazine-18, by-15)
amod(magazine-18, teen-16)
compound(magazine-18, people-17)
nmod:by(thrown-14, magazine-18)
punct(magazine-18, ,-19)
det(perk-21, a-20)
appos(magazine-18, perk-21)
mark(character-26, of-22)
cop(character-26, being-23)
det(character-26, a-24)
amod(character-26, central-25)
acl(perk-21, character-26)
case(series-32, on-27)
nmod:poss(series-32, mtv-28)
case(mtv-28, 's-29)
amod(series-32, popular-30)
compound(series-32, reality-31)
nmod:on(character-26, series-32)
punct(character-26, ''-33)
compound(beach-36, laguna-34)
compound(beach-36, _-35)
dep(character-26, beach-36)
punct(character-26, :-37)
det(county-41, the-38)
amod(county-41, real-39)
compound(county-41, orange-40)
dep(character-26, county-41)
punct(attended-10, .-42)
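One workaround I have considered (this is only a sketch, not a confirmed fix; the function names `protect` and `restore` and the placeholder string are my own invention) is to shield the underscore from the tokenizer by swapping it for an alphanumeric placeholder before parsing, so "lauren_conrad" stays a single token, and then mapping the placeholder back to "_" in the tokens the parser returns:

```python
# Placeholder assumed not to occur in the input text; because it is purely
# alphanumeric, the tokenizer should keep the surrounding word intact.
PLACEHOLDER = "UNDERSCOREXX"

def protect(sentence):
    # "lauren_conrad" -> "laurenUNDERSCOREXXconrad" (one token for the tokenizer)
    return sentence.replace("_", PLACEHOLDER)

def restore(token):
    # map the placeholder back to the original underscore
    return token.replace(PLACEHOLDER, "_")

protected = protect("last november , lauren_conrad , 19 , attended ...")
print(protected)
print(restore("laurenUNDERSCOREXXconrad"))
```

The idea would be to call `nlp.dependency_parse(protect(sentence))` and then run `restore` over the tokens, but I don't know whether this is the intended way to handle multi-word entities, or whether there is a tokenizer option that preserves underscores directly.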