I created a dummy model using only the ner pipe to detect a number of custom entities, e.g. custom-entity-1, custom-entity-2, custom-entity-3. Now I want to deploy it. The training is similar to what is shown here.
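For context, the training and saving follow the usual spaCy v2 add-label/update loop; a minimal sketch (the training data, labels, and the model_output directory are placeholders, not my real setup):

import random
import spacy

# Placeholder training data in spaCy's (text, annotations) format;
# the real data uses my custom-entity-* labels.
TRAIN_DATA = [
    ("Example sentence mentioning foo",
     {"entities": [(28, 31, "custom-entity-1")]}),
]

nlp = spacy.blank("en")            # start from a blank pipeline
ner = nlp.create_pipe("ner")       # only the ner pipe is added
nlp.add_pipe(ner)
for _, annotations in TRAIN_DATA:
    for start, end, label in annotations["entities"]:
        ner.add_label(label)

optimizer = nlp.begin_training()
for _ in range(10):
    random.shuffle(TRAIN_DATA)
    losses = {}
    for text, annotations in TRAIN_DATA:
        nlp.update([text], [annotations], sgd=optimizer, losses=losses)

nlp.to_disk("model_output")        # assumed output directory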
Adding the model folder directly to the codebase causes problems once it is pushed to a remote git repository and cloned back: some of the files appear to get modified with a different encoding.
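For example, loading the copy that comes back from the repository fails (the directory name custom_model is assumed; I run this in the interpreter):

import spacy
nlp = spacy.load("custom_model")   # model folder cloned back from the remote repo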
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\__init__.py", line 21, in load
return util.load_model(name, **overrides)
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\util.py", line 116, in load_model
return load_model_from_path(Path(name), **overrides)
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\util.py", line 156, in load_model_from_path
return nlp.from_disk(model_path)
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\language.py", line 647, in from_disk
util.from_disk(path, deserializers, exclude)
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\util.py", line 511, in from_disk
reader(path / key)
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\language.py", line 636, in <lambda>
('tokenizer', lambda p: self.tokenizer.from_disk(p, vocab=False)),
File "tokenizer.pyx", line 367, in spacy.tokenizer.Tokenizer.from_disk
File "tokenizer.pyx", line 402, in spacy.tokenizer.Tokenizer.from_bytes
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\spacy\util.py", line 490, in from_bytes
msg = msgpack.loads(bytes_data, raw=False)
File "C:\Users\NBarnab\AppData\Local\Programs\Python\Python36\lib\site-packages\msgpack_numpy.py", line 184, in unpackb
return _unpackb(packed, **kwargs)
File "msgpack\_unpacker.pyx", line 200, in msgpack._unpacker.unpackb
TypeError: unhashable type: 'list'
Downloading one of spaCy's pretrained models is clean:

python -m spacy download en_core_web_sm

Is there a similarly clean way to deploy a custom model?
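Ideally the custom model could be installed and then loaded by name, without the model folder living in the code repository; a hypothetical sketch of the usage I am after (my_custom_ner is a made-up package/model name):

import spacy

# Assumes the custom model has somehow been installed as a package,
# analogous to en_core_web_sm; the name my_custom_ner is hypothetical.
nlp = spacy.load("my_custom_ner")
doc = nlp("Some text mentioning custom-entity-1 and custom-entity-2 instances.")
print([(ent.text, ent.label_) for ent in doc.ents])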