Question

I have a trained named entity recognition (NER) model from eBrevia. I would like to know whether there is a way to load it into CoreNLP or spaCy using Python or Java.

Edit: if the pre-trained model is a pickle file, can it be loaded with CoreNLP or spaCy?

Thanks in advance!

Answer 1

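With spaCy (Python), you should be able to write a custom component and implement a wrapper around your current NER model inside it. A custom component always takes a doc as input, modifies it, and returns it. This is what lets you chain custom components and "pre-built" components.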

For example, if your NER model takes a list of tokens as input and returns a list of BILUO tags for those tokens, you could wrap the model like this:

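# spaCy v2 API; in spaCy v3 the equivalent is spacy.training.biluo_tags_to_spans
from spacy.gold import spans_from_biluo_tags

def custom_ner_wrapper(doc):
    # Pass the token texts to the external model
    words = [token.text for token in doc]
    # your_custom_ner_model is a placeholder; it must return one BILUO tag per token
    custom_entities = your_custom_ner_model(words)
    # Convert the tags to Span objects and set them as the doc's entities
    doc.ents = spans_from_biluo_tags(doc, custom_entities)
    return doc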
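For readers unfamiliar with the tag scheme the wrapper expects, here is a minimal pure-Python sketch of how BILUO tags (Begin, Inside, Last, Unit, Outside) map to token spans. The function name `biluo_to_spans` and the sample sentence are made up for illustration; this is not spaCy's implementation, only a rough picture of what the conversion does.

```python
def biluo_to_spans(tags):
    """Convert BILUO tags to (start, end, label) token spans, end exclusive."""
    spans = []
    start = None  # index of the currently open "B-" tag, if any
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "U":
            # Single-token entity
            spans.append((i, i + 1, label))
            start = None
        elif prefix == "B":
            # Open a multi-token entity
            start = i
        elif prefix == "L" and start is not None:
            # Close the open multi-token entity
            spans.append((start, i + 1, label))
            start = None
        # "I" tags simply continue the open entity
    return spans


# Hypothetical model output: one tag per token
words = ["Barack", "Obama", "visited", "Paris", "."]
tags = ["B-PERSON", "L-PERSON", "O", "U-GPE", "O"]
spans = biluo_to_spans(tags)
print(spans)
```

If your model emits a different scheme (for example plain IOB), you would need to convert its output to BILUO before handing it to the wrapper.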

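Once you have defined the custom pipeline component named custom_ner_wrapper, you have to add it to the nlp pipeline as follows (this is the spaCy v2 call; spaCy v3 requires registering the function with @Language.component and passing the registered name to add_pipe):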

nlp.add_pipe(custom_ner_wrapper)

More information can be found here: https://spacy.io/usage/processing-pipelines#wrapping-models-libraries
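Regarding the pickle case from the question's edit: pickle is a Python serialization format, so it is not something CoreNLP reads natively, but on the Python side a pickled object can be loaded with the standard library and then called from the wrapper. The sketch below is only an illustration under assumptions: whether eBrevia actually exports a pickle file is not known here, and the dictionary `toy_model` is a made-up stand-in for a real model. With a real file, any classes the pickle refers to must be importable when loading, and you should only unpickle files you trust.

```python
import io
import pickle

# Hypothetical stand-in for a trained model: a token -> BILUO tag lookup.
toy_model = {"Paris": "U-GPE", "London": "U-GPE"}

# Serialize to an in-memory buffer so the example is self-contained;
# with a real file you would use open("model.pkl", "rb") instead.
buffer = io.BytesIO()
pickle.dump(toy_model, buffer)
buffer.seek(0)

# Load the object back, as you would with a pickled model file.
loaded_model = pickle.load(buffer)


def your_custom_ner_model(words):
    # Return one BILUO tag per token, defaulting to "O" (outside any entity).
    return [loaded_model.get(word, "O") for word in words]


tags = your_custom_ner_model(["She", "left", "Paris"])
print(tags)
```

A function shaped like `your_custom_ner_model` here, taking a list of token strings and returning one tag per token, is what the custom component above expects to call.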
