How can I align two sentences that contain the same named-entity categories using Stanford NER in Python?

Time: 2017-05-28 09:44:28

Tags: python stanford-nlp named-entity-recognition

I want to use Stanford NER in Python to align two different sentences that contain the same NER categories ('PERSON', 'LOCATION', 'ORGANIZATION').

sentence1 = "John is reading a booklet in London Library"
sentence2 = "Michael reads a brochure in England Library"

The result I want looks like this:

result = [[[u'John', u'PERSON'], [u'Michael', u'PERSON']], [[u'London', u'LOCATION'], [u'England', u'LOCATION']]]
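To make the target structure concrete, here is a minimal sketch of one way to build it, assuming each sentence has already been tagged into a list of `(token, tag)` tuples (the shape NLTK's `StanfordNERTagger.tag()` returns). The function name `align_entities` and the hand-written tags are illustrative assumptions, not tagger output; entities are paired per category in order of appearance, which is only one possible reading of "align".

```python
from collections import defaultdict

CATEGORIES = ('PERSON', 'LOCATION', 'ORGANIZATION')


def align_entities(tagged1, tagged2, categories=CATEGORIES):
    """Pair entities of the same category from two tagged sentences.

    tagged1 and tagged2 are lists of (token, tag) tuples.
    """
    def by_category(tagged):
        # Group the entity tokens of one sentence under their category
        groups = defaultdict(list)
        for token, tag in tagged:
            if tag in categories:
                groups[tag].append([token, tag])
        return groups

    groups1 = by_category(tagged1)
    groups2 = by_category(tagged2)

    aligned = []
    for category in categories:
        # zip stops at the shorter list, so unmatched entities are dropped
        for left, right in zip(groups1[category], groups2[category]):
            aligned.append([left, right])
    return aligned


# Hand-written tags standing in for real tagger output (an assumption)
tagged1 = [('John', 'PERSON'), ('is', 'O'), ('reading', 'O'), ('a', 'O'),
           ('booklet', 'O'), ('in', 'O'), ('London', 'LOCATION'), ('Library', 'O')]
tagged2 = [('Michael', 'PERSON'), ('reads', 'O'), ('a', 'O'), ('brochure', 'O'),
           ('in', 'O'), ('England', 'LOCATION'), ('Library', 'O')]

result = align_entities(tagged1, tagged2)
print(result)
```

Pairing by order of appearance is the simplest choice; if a sentence holds several entities of one category, a different matching rule may be needed.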

This is the code I tried:

import os
from nltk.tag import StanfordNERTagger

def alignNER(sent1, sent2):
    # Tell NLTK where the Java executable lives
    java_path = 'C:/program files/java/jre/bin/java.exe'
    os.environ['JAVAHOME'] = java_path
    st = StanfordNERTagger('usr/english.all.3class.distsim.crf.ser', 'usr/stanford-ner.jar')

    result = []
    # Tag each token of sentence 1 and keep the entities with their index
    for i in xrange(len(sent1)):
        if sent1[i]:
            word = sent1[i]
            temp = st.tag(word.split())
            for token, tag in temp:
                if tag in ['PERSON', 'LOCATION', 'ORGANIZATION']:
                    result.append([temp, i])
    return result

    # Same procedure for sentence 2
    for j in xrange(len(sent2)):
        if sent2[j]:
            word = sent2[j]
            temp = st.tag(word.split())
            for token, tag in temp:
                if tag in ['PERSON', 'LOCATION', 'ORGANIZATION']:
                    result.append([temp, j])
    return result

When I call it with

checkNER = alignNER(sent1, sent2)
print checkNER

the result only shows sentence 1, with the position index of each detected category:

[[[u'John', u'PERSON'], 0], [[u'London', u'LOCATION'], 6]]

Can anyone help? Thanks.
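A likely cause of the sentence-1-only output: in the function above, the first `return result` comes right after the loop over `sent1`, so the function exits before the loop over `sent2` ever runs. The sketch below shows one way to restructure it with a single helper and one return after both sentences are processed. The names `collect_entities`, `collect_both`, and `fake_tag` are hypothetical; `fake_tag` is a stand-in for `st.tag` so the example runs without Java, and the real tagger may label tokens differently.

```python
CATEGORIES = ('PERSON', 'LOCATION', 'ORGANIZATION')


def collect_entities(tokens, tag_fn, categories=CATEGORIES):
    """Return [[token, tag], index] for every token tagged with a wanted category."""
    found = []
    # Tag the whole token list in one call, then walk it with positions
    for index, (token, tag) in enumerate(tag_fn(tokens)):
        if tag in categories:
            found.append([[token, tag], index])
    return found


def collect_both(sent1, sent2, tag_fn):
    # Single return at the very end, after both sentences have been processed
    return [collect_entities(sent1, tag_fn), collect_entities(sent2, tag_fn)]


# Hypothetical lookup table standing in for the Stanford tagger (an assumption)
FAKE_TAGS = {'John': 'PERSON', 'Michael': 'PERSON',
             'London': 'LOCATION', 'England': 'LOCATION'}


def fake_tag(tokens):
    # Same return shape as StanfordNERTagger.tag(): a list of (token, tag) tuples
    return [(token, FAKE_TAGS.get(token, 'O')) for token in tokens]


sent1 = 'John is reading a booklet in London Library'.split()
sent2 = 'Michael reads a brochure in England Library'.split()

entities1, entities2 = collect_both(sent1, sent2, fake_tag)
print(entities1)
print(entities2)
```

With the real tagger, `st.tag` would be passed in place of `fake_tag`. Tagging a full token list in one call, rather than word by word, also gives the model sentence context and avoids starting a Java process per word.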

0 answers:

No answers yet