I am trying to run the StanfordCoreNLP parser, and I have the following code:
from pycorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('http://localhost:9000')

def depparse(text):
    parsed = ""
    output = nlp.annotate(text, properties={
        'annotators': 'depparse',
        'outputFormat': 'json'
    })
    for i in output["sentences"]:
        for j in i["basicDependencies"]:
            parsed = parsed + str(j["dep"] + '(' + j["governorGloss"] + ' ') + str(j["dependentGloss"] + ')' + ' ')
    return parsed

text = 'I shot an elephant in my sleep'
depparse(text)
This gives me the following output:
'ROOT(ROOT shot) nsubj(shot I) det(elephant an) dobj(shot elephant) case(sleep in) nmod:poss(sleep my) nmod(shot sleep) '
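To convert the relations into a tree, I came across the Stack Overflow post "Stanford NLP parse tree format". However, the parser output there is in "bracketed parse (tree)" form, so I am not sure how to apply it to my output. I also tried changing the output format, but that gave an error.
I also found "Python - Generate a dictionary(tree) from a list of tuples" and implemented it as follows: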
list_of_tuples = [('ROOT','ROOT', 'shot'),('nsubj','shot', 'I'),('det','elephant', 'an'),('dobj','shot', 'elephant'),('case','sleep', 'in'),('nmod:poss','sleep', 'my'),('nmod','shot', 'sleep')]

# First pass: create one node per dependent word.
nodes = {}
for i in list_of_tuples:
    rel, parent, child = i
    nodes[child] = {'Name': child, 'Relationship': rel}

# Second pass: attach each node to its parent, or to the forest if its parent is ROOT.
forest = []
for i in list_of_tuples:
    rel, parent, child = i
    node = nodes[child]
    if parent == 'ROOT':  # this should be the root node
        forest.append(node)
    else:
        parent = nodes[parent]
        if 'children' not in parent:
            parent['children'] = []
        children = parent['children']
        children.append(node)

print(forest)
I get the following output:
[{'Name': 'shot', 'Relationship': 'ROOT', 'children': [{'Name': 'I', 'Relationship': 'nsubj'}, {'Name': 'elephant', 'Relationship': 'dobj', 'children': [{'Name': 'an', 'Relationship': 'det'}]}, {'Name': 'sleep', 'Relationship': 'nmod', 'children': [{'Name': 'in', 'Relationship': 'case'}, {'Name': 'my', 'Relationship': 'nmod:poss'}]}]}]
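The nested structure above is hard to read on one line. As a rough sketch (the `render` helper is a hypothetical name, not part of pycorenlp or CoreNLP), the forest could be walked recursively and printed as an indented outline. The `forest` literal is copied from the output above so the example runs without a CoreNLP server.

```python
def render(node, depth=0):
    """Return the outline lines for one node and its subtree (hypothetical helper)."""
    lines = ["  " * depth + "{} ({})".format(node["Name"], node["Relationship"])]
    # Leaf nodes have no 'children' key, so fall back to an empty list.
    for child in node.get("children", []):
        lines.extend(render(child, depth + 1))
    return lines

# Copied from the output shown in the question.
forest = [{'Name': 'shot', 'Relationship': 'ROOT', 'children': [
    {'Name': 'I', 'Relationship': 'nsubj'},
    {'Name': 'elephant', 'Relationship': 'dobj', 'children': [
        {'Name': 'an', 'Relationship': 'det'}]},
    {'Name': 'sleep', 'Relationship': 'nmod', 'children': [
        {'Name': 'in', 'Relationship': 'case'},
        {'Name': 'my', 'Relationship': 'nmod:poss'}]}]}]

outline = [line for root in forest for line in render(root)]
print("\n".join(outline))
```

Each level of indentation corresponds to one step down from the governor to its dependents.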
Answer 0 (score: 0)
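This is admittedly a bit off-topic: it does not answer your original question, but rather your last comment. I am posting it as an answer because the code does not fit well in a comment. With a small change to your depparse function, you can get the output in the format you need: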
def depparse(text):
    output = nlp.annotate(text, properties={
        'annotators': 'depparse',
        'outputFormat': 'json'
    })
    sentences = output['sentences']
    # The input may contain several sentences; only the first one is returned here.
    if not sentences:
        return []
    return [(dep['dep'], dep['governorGloss'], dep['dependentGloss'])
            for dep in sentences[0]['basicDependencies']]
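One caveat with keying nodes by the word itself, as the tree-building code in the question does, is that a sentence containing the same word twice would overwrite one node with the other. A possible workaround, sketched below, is to key nodes by token position instead. This assumes each entry in `basicDependencies` also carries integer `governor` and `dependent` fields, with `0` standing for the artificial ROOT; the `build_tree` name and the hand-written `sample` data are illustrative only, so the snippet runs without a server.

```python
def build_tree(dependencies):
    """Build nested dicts from dependency entries, keyed by token index (sketch)."""
    # First pass: one node per dependent token, keyed by its position.
    nodes = {}
    for dep in dependencies:
        nodes[dep["dependent"]] = {
            "Name": dep["dependentGloss"],
            "Relationship": dep["dep"],
        }
    # Second pass: attach every node to its governor, or to the roots list.
    roots = []
    for dep in dependencies:
        node = nodes[dep["dependent"]]
        if dep["governor"] == 0:  # assumption: index 0 denotes ROOT
            roots.append(node)
        else:
            nodes[dep["governor"]].setdefault("children", []).append(node)
    return roots

# Hand-written stand-in for output["sentences"][0]["basicDependencies"].
sample = [
    {"dep": "ROOT", "governor": 0, "governorGloss": "ROOT", "dependent": 2, "dependentGloss": "shot"},
    {"dep": "nsubj", "governor": 2, "governorGloss": "shot", "dependent": 1, "dependentGloss": "I"},
    {"dep": "det", "governor": 4, "governorGloss": "elephant", "dependent": 3, "dependentGloss": "an"},
    {"dep": "dobj", "governor": 2, "governorGloss": "shot", "dependent": 4, "dependentGloss": "elephant"},
    {"dep": "case", "governor": 7, "governorGloss": "sleep", "dependent": 5, "dependentGloss": "in"},
    {"dep": "nmod:poss", "governor": 7, "governorGloss": "sleep", "dependent": 6, "dependentGloss": "my"},
    {"dep": "nmod", "governor": 2, "governorGloss": "shot", "dependent": 7, "dependentGloss": "sleep"},
]

tree = build_tree(sample)
print(tree)
```

The result has the same shape as the forest in the question, so any code written against that structure should work unchanged.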