Question

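I am using NLTK to extract relations between PERSON and ORGANIZATION.

I also want to extract relations between ORGANIZATION and LOCATION. The NLTK version is 3.2.1.

I have done POS tagging and named entity recognition (NER), and I have drawn the parse tree for the NER result, but I cannot extract the relations mentioned above from the sentence.

Here is the code: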

import nltk, re
from nltk import word_tokenize

sentence = "Mark works at JPMC in London every day"
pos_tags = nltk.pos_tag(word_tokenize(sentence))            # POS tagging of the sentence
ne = nltk.ne_chunk(pos_tags)                                # Named Entity Recognition
ne.draw()                                                   # Draw the Parse Tree

IN = re.compile(r'.*\bin\b(?!\b.+ing)')
for rel1 in nltk.sem.extract_rels('PER', 'ORG', pos_tags, pattern = IN):
    print(nltk.sem.rtuple(rel1))
for rel2 in nltk.sem.extract_rels('ORG', 'LOC', pos_tags, pattern = IN):
    print(nltk.sem.rtuple(rel2))

How can I extract the 'PERSON - ORGANIZATION' relation and the 'ORGANIZATION - LOCATION' relation?

Answer 1

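I think the document passed to extract_rels should not be the POS-tagged tokens; it should be the NE-chunked tree.

Working code: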

import re

import nltk
from nltk import word_tokenize

senten = "Mark works in JPMC in London every day"
pos_tags = nltk.pos_tag(word_tokenize(senten))  # POS tagging of the sentence
ne = nltk.ne_chunk(pos_tags)  # Named Entity Recognition, returns a chunk tree
# ne.draw()  # Draw the Parse Tree

print(pos_tags)

# Pattern applied to the text between the two entities
IN = re.compile(r'.*\bin\b(?!\b.+ing)')

# Pass the NE-chunked tree (ne), not the flat POS-tagged list
for rel in nltk.sem.extract_rels('PERSON', 'ORGANIZATION', ne, corpus='ace', pattern=IN):
    print(nltk.sem.rtuple(rel))

Output

[PER: 'Mark/NNP'] 'works/VBZ in/IN' [ORG: 'JPMC/NNP']
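
One detail that may be worth noting: the question's sentence says "works at", while the working sentence says "works in". The sketch below shows why that could matter. It assumes the pattern is matched against the text between the two entities, rendered as word/TAG pairs as in the output above. The filler strings are hand-written for illustration and are not produced by NLTK.

```python
import re

# Same pattern as in the question and the answer.
IN = re.compile(r'.*\bin\b(?!\b.+ing)')

# Hypothetical filler strings in the assumed "word/TAG" format.
fillers = [
    "works/VBZ in/IN",       # contains the lowercase word "in"
    "works/VBZ at/IN",       # only the tag is "IN"; the word is "at"
    "in/IN supporting/VBG",  # "in" followed later by an "-ing" word
]

# Record whether the pattern matches each filler from its start.
results = {filler: bool(IN.match(filler)) for filler in fillers}

for filler, matched in results.items():
    print(f"{filler!r}: {matched}")
```

Under these assumptions only the first filler matches, because the pattern requires the lowercase word "in" and rejects a following "-ing" word. To cover "works at" as well, one option would be to widen the pattern, for example r'.*\b(in|at)\b(?!\b.+ing)'.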
