Replicating the fasttext command to query and save FastText vectors

Time: 2019-07-29 19:57:16

Tags: python nlp fasttext

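I am setting up NLP preprocessing with a pretrained FastText model to query and save word vectors. I am running into FileNotFoundError: [Errno 2] No such file or directory: 'fasttext': 'fasttext', which I have not been able to resolve.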

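This is for a clinical text similarity NLP project I am working on. I double-checked that all the files and folders exist in the directory. I should also point out that I used both FloydHub and Google Colab to make sure this is not an environment issue. I went through the process twice and ended up with the same error both times. A second pair of eyes would definitely help.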

The code that replicates the command fasttext print-vectors model.bin > vectors.vec is as follows:

with open(VOCAB_FILE) as f_vocab:
    with open(OUTPUT_FILE, 'a') as f_output:
        subprocess.run(
            [FASTTEXT_EXECUTABLE, 'print-word-vectors', PRETRAINED_MODEL_FILE],
            stdin=f_vocab,
            stdout=f_output)
The traceback I am getting is below:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-150-7b469ee34f75> in <module>()
      4             [FASTTEXT_EXECUTABLE, 'print-word-vectors', PRETRAINED_MODEL_FILE],
      5             stdin=f_vocab,
----> 6             stdout=f_output)

/usr/local/lib/python3.6/subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
    401         kwargs['stdin'] = PIPE
    402 
--> 403     with Popen(*popenargs, **kwargs) as process:
    404         try:
    405             stdout, stderr = process.communicate(input, timeout=timeout)

/usr/local/lib/python3.6/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
    707                                 c2pread, c2pwrite,
    708                                 errread, errwrite,
--> 709                                 restore_signals, start_new_session)
    710         except:
    711             # Cleanup if the child failed starting.

/usr/local/lib/python3.6/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1342                         if errno_num == errno.ENOENT:
   1343                             err_msg += ': ' + repr(err_filename)
-> 1344                     raise child_exception_type(errno_num, err_msg, err_filename)
   1345                 raise child_exception_type(err_msg)
   1346 

FileNotFoundError: [Errno 2] No such file or directory: 'fasttext': 'fasttext'

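The expected result is to be able to query and save the FastText vectors. The code snippet above was taken from a GitHub repository and was used for Kaggle's Quora Question Pairs.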

1 answer:

Answer 0 (score: 0)

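The fasttext command-line tool must be installed in order to query and save FastText vectors. The error refers to the 'fasttext' executable itself, which subprocess.run could not find, not to the model or vocabulary files.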
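
As a rough illustration of that point, the sketch below checks whether the executable can be found before launching it, so a missing binary produces a clearer message. The helper names resolve_executable and print_word_vectors are hypothetical and not part of fastText; the sketch assumes the fastText command-line tool has been built or installed separately and that the print-word-vectors subcommand is used as in the question.

```python
import shutil
import subprocess


def resolve_executable(name):
    """Return the full path of `name`, or raise a descriptive error.

    `name` may be a bare command looked up on PATH (e.g. 'fasttext')
    or an explicit path to a binary. Hypothetical helper for illustration.
    """
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"'{name}' was not found; install or build the fastText "
            "command-line tool, or pass the full path to the binary."
        )
    return path


def print_word_vectors(executable, model_file, vocab_file, output_file):
    """Pipe the words in vocab_file into `<executable> print-word-vectors
    <model_file>` and append the resulting vectors to output_file."""
    exe = resolve_executable(executable)
    with open(vocab_file) as f_vocab, open(output_file, "a") as f_output:
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(
            [exe, "print-word-vectors", model_file],
            stdin=f_vocab,
            stdout=f_output,
            check=True,
        )
```

If resolve_executable('fasttext') raises, the binary is missing from the environment (which would explain seeing the same error on both FloydHub and Google Colab), and FASTTEXT_EXECUTABLE would need to point at the full path of a built binary instead.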