Uploading word embeddings to Google Colab

Date: 2018-02-24 05:09:18

Tags: nlp word-embedding google-colaboratory

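I'm using Google Colab for my DL model (NLP). I have uploaded and imported my training data (see the screenshot), and now I want to load the pretrained GloVe word embeddings. If I upload them the same way, I guess it will take several hours, and even then I'm not sure it will work.

Has anyone run into the same problem?

Thanks

Screenshot: uploading training data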

2 answers:

Answer 0 (score: 2)

Try downloading it directly with wget:

!wget http://nlp.stanford.edu/data/glove.6B.zip
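
Once the archive is on the Colab disk, you probably only need one of the files inside it. A minimal sketch using the standard-library `zipfile` module follows; the helper name `extract_member` is hypothetical, and the member name in the usage note is an assumption about the archive's contents, so list the archive first if unsure.

```python
import zipfile


def extract_member(zip_path, member, dest_dir="."):
    """Extract a single file from a zip archive and return its path on disk."""
    with zipfile.ZipFile(zip_path) as archive:
        # archive.namelist() shows the available members if the name is unknown.
        return archive.extract(member, path=dest_dir)
```

For example, `extract_member("glove.6B.zip", "glove.6B.100d.txt")` would unpack only the 100-dimensional vectors, assuming the archive contains a member with that name.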

Answer 1 (score: 0)

I uploaded the file to my Google Drive and then imported it from Drive into Colab:

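# Authenticate this Colab session so the Drive API calls are authorized.
from google.colab import auth
auth.authenticate_user()

from googleapiclient.discovery import build
drive_service = build('drive', 'v3')
file_id = 'your file id'  # replace with the ID of the file in your Drive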

import io
from googleapiclient.http import MediaIoBaseDownload

request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
  # _ is a placeholder for a progress object that we ignore.
  # (Progress reporting is skipped here to keep the example short.)
  _, done = downloader.next_chunk()

downloaded.seek(0)
# Decode the raw bytes into text; format() would return the bytes repr ("b'...'").
glove = downloaded.read().decode('utf-8')
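
To actually use the vectors, the downloaded bytes still need to be split into words and numbers. The sketch below assumes the file is in the plain GloVe text format (one word per line followed by space-separated floats) and that `downloaded` is the `io.BytesIO` buffer filled above; `parse_glove` is a hypothetical helper, not part of any library.

```python
import io


def parse_glove(buffer):
    """Parse GloVe-style text from a binary file-like object into a dict.

    Returns a mapping from each word to its vector as a list of floats.
    """
    buffer.seek(0)  # rewind in case the buffer was already read
    embeddings = {}
    for raw_line in buffer:
        parts = raw_line.decode("utf-8").rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(value) for value in parts[1:]]
    return embeddings
```

Calling `parse_glove(downloaded)` would then give a dictionary you can index by word. For a large file you may prefer to convert each list into a NumPy array to save memory.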