Problem reading a text corpus for a GloVe word-embedding implementation

Time: 2017-08-09 07:29:45

Tags: python nlp word-embedding

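I am trying to train on my own text corpus using the GloVe implementation at https://github.com/hans/glove.py/blob/master/glove.py. I created a text corpus in which tokens are separated by single spaces. The file is 3.6 GB. When I try to load the file, I get this error: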

2017-08-09 12:51:47,848 Fetching vocab..
Traceback (most recent call last):
  File "Glove_python_bbc.py", line 378, in <module>
    main(parse_args())
  File "Glove_python_bbc.py", line 347, in main
    vocab = get_or_build(arguments.vocab_path, build_vocab, corpus)
  File "Glove_python_bbc.py", line 83, in get_or_build
    obj = msgpack.load(obj_f, use_list=False, encoding='utf-8')
  File "msgpack\_unpacker.pyx", line 164, in msgpack._unpacker.unpack (msgpack/_unpacker.cpp:2622)
  File "msgpack\_unpacker.pyx", line 143, in msgpack._unpacker.unpackb (msgpack/_unpacker.cpp:2143)
msgpack.exceptions.ExtraData: unpack(b) received extra data.

Could someone help me figure out what is wrong with the file? Thanks.
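
A note on the traceback: it fails inside get_or_build, where msgpack.load is called on the file given as arguments.vocab_path. ExtraData means msgpack decoded one complete object and then found more bytes after it. One possible cause (an assumption, not something the post confirms) is that the vocab path points to an existing file that is not a msgpack dump, for example the plain-text corpus itself. In the msgpack format, any byte from 0x00 to 0x7f is a complete one-byte integer, so a text file that starts with an ASCII character would decode as a single integer followed by "extra data". The sketch below uses only the standard library to check the first byte of a file; the helper name classify_first_byte is hypothetical and not part of glove.py or msgpack.

```python
def classify_first_byte(path):
    """Return a rough msgpack type family for the first byte of the file.

    Hypothetical diagnostic helper; byte ranges follow the msgpack format spec.
    """
    with open(path, "rb") as f:
        # bytearray indexes to int on both Python 2 and Python 3
        head = bytearray(f.read(1))
    if not head:
        return "empty"
    b = head[0]
    if b <= 0x7F:
        # A complete one-byte integer: ASCII text files start like this
        return "positive fixint"
    if 0x80 <= b <= 0x8F or b in (0xDE, 0xDF):
        # fixmap, map16, map32: what a dumped dict would start with
        return "map"
    if 0x90 <= b <= 0x9F or b in (0xDC, 0xDD):
        # fixarray, array16, array32
        return "array"
    return "other"
```

If this returns "positive fixint" for the file passed as the vocab path, that file is probably not a saved vocabulary. Assuming get_or_build builds and saves the object when the file is absent, as its name suggests, pointing the vocab path at a new, non-existent file would be one thing to try.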

0 answers:

No answers yet