Question

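I have the following code, and I have made sure that the file name and extension are correct. However, I still get the error shown in the output below.

I did see that someone else asked a similar question on Stack Overflow, and I read the answers, but they did not help me.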

Failed to load a .bin.gz pre trained words2vecx

Any suggestions on how to fix this?

Input:

import gensim
word2vec_path = "GoogleNews-vectors-negative300.bin.gz"
word2vec = gensim.models.KeyedVectors.load_word2vec_format(word2vec_path, binary=True)

Output:

OSError: Not a gzipped file (b've')

Answer 1

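The problem is that the file you downloaded is not a gzip file. If you check its size, it is probably in the KB range rather than the GB range. That is what happened to me when I downloaded the file from this Github link, because the repository requires git-lfs.

Here is another way to solve the problem:

Download the model from a terminal with the following command: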
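The check described above can be sketched in a few lines of Python. This is a minimal sketch, not part of the original answer: `describe_download` is a hypothetical helper name, and it assumes the file sits in the current directory. It relies on two facts: every gzip stream starts with the bytes `1f 8b`, and a Git LFS pointer file is a small text file that starts with `version https://git-lfs...`, which would be consistent with the `b've'` in the error message.

```python
import os

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def describe_download(path):
    """Return a short diagnosis of the file at `path` (hypothetical helper)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(64)
    if head.startswith(GZIP_MAGIC):
        return f"gzip archive, {size} bytes"
    if head.startswith(b"version https://git-lfs"):
        return f"Git LFS pointer file, {size} bytes (not the real model)"
    return f"not gzip, starts with {head[:2]!r}, {size} bytes"


if __name__ == "__main__":
    # Assumed location: the file name used in the question.
    path = "GoogleNews-vectors-negative300.bin.gz"
    if os.path.exists(path):
        print(describe_download(path))
```

If the result reports a pointer file or a size of only a few KB, the download has to be repeated from a source that serves the real archive.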

wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"

Then load the model with gensim as usual:

from gensim import models

# gensim decompresses .gz files transparently, so the downloaded archive can be loaded directly
w = models.KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin.gz', binary=True)
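If an uncompressed `.bin` file is wanted on disk (for instance, to avoid decompressing the archive on every load), it can be unpacked once with the standard library. The following is only a sketch under that assumption: `gunzip_file` is a hypothetical helper name, and the file names are the ones used in this thread.

```python
import gzip
import os
import shutil


def gunzip_file(src, dst):
    """Decompress the gzip archive `src` into `dst`, streaming in chunks."""
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


if __name__ == "__main__":
    # Assumed location: the archive downloaded with wget above.
    src = "GoogleNews-vectors-negative300.bin.gz"
    if os.path.exists(src):
        gunzip_file(src, "GoogleNews-vectors-negative300.bin")
```

Streaming with `shutil.copyfileobj` keeps memory use low, which matters for an archive of this size. Running `gunzip` in the terminal should give the same result.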

Hope this helps!

Answer 2

Try this:

from gensim import models  # KeyedVectors is provided by gensim, not tensorflow
word2vec_path = 'https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz'
word2vec = models.KeyedVectors.load_word2vec_format(word2vec_path, binary=True)
