When I upload data with the following code, the data disappears as soon as the session disconnects.
from google.colab import files
uploaded = files.upload()
for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))
Please suggest a way to upload data so that it stays intact even after the session has been disconnected for several days.
Answer 0 (score: 1)
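I keep my data permanently in a .zip file on Google Drive and use the code below to load it into the Google Colab VM.
Paste it into a cell and change file_id. You can find the file_id in the file's URL on Google Drive (right-click the file -> Get shareable link -> open the link and look for the part of the URL after ?id=).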
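As a side note, the file ID can also be pulled out of the share link programmatically. The following is a minimal sketch, not part of the original answer: the helper name `extract_file_id` is hypothetical, and it assumes the link has either the `?id=<ID>` form described above or the `/file/d/<ID>/view` form.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_file_id(share_url):
    """Return the Drive file ID found in a share link, or None if absent."""
    parsed = urlparse(share_url)
    # Links of the form .../open?id=<ID> carry the ID in the query string.
    query_id = parse_qs(parsed.query).get("id")
    if query_id:
        return query_id[0]
    # Links of the form .../file/d/<ID>/view carry the ID in the path.
    match = re.search(r"/d/([A-Za-z0-9_-]+)", parsed.path)
    return match.group(1) if match else None
```

The returned string is the value to paste into file_id in the cell.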
#@title uploader
file_id = "1BuM11fJJ1qdZH3VbQ-GwPlK5lAvXiNDv" #@param {type:"string"}
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
# PyDrive reference:
# https://googledrive.github.io/PyDrive/docs/build/html/index.html
# Authentication was already performed above, so it is not repeated here.
from googleapiclient.discovery import build
drive_service = build('drive', 'v3')
# Replace the assignment below with your file ID
# to download a different file.
#
# A file ID looks like: 1gLBqEWEBQDYbKCDigHnUXNTkzl-OslSO
import io
from googleapiclient.http import MediaIoBaseDownload
request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
    # _ is a placeholder for a progress object that we ignore.
    # (Our file is small, so we skip reporting progress.)
    _, done = downloader.next_chunk()
fileId = drive.CreateFile({'id': file_id})  # file_id is the Drive file ID, e.g. 1iytA1n2z4go3uVCwE_vIKouTKyIDjEq
print(fileId['title'])
fileId.GetContentFile(fileId['title']) # Save Drive file as a local file
!unzip {fileId['title']}
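The cell above also leaves the archive in memory in the `downloaded` buffer, so the zip could be unpacked from there in pure Python instead of calling `!unzip`. This is a minimal sketch under that assumption, not the answer author's method; the helper name `extract_zip_bytes` and the `target_dir` parameter are hypothetical.

```python
import io
import zipfile


def extract_zip_bytes(buffer, target_dir="."):
    """Unpack a zip archive held in a BytesIO buffer; return the member names."""
    # Rewind in case the buffer position was left at the end after the download.
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

In the notebook it would be called as `extract_zip_bytes(downloaded)` once the download loop has finished.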
Answer 1 (score: 0)
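Saving the data in GDrive is a good approach (@skaem).
If your data includes code, I suggest simply running git clone at the start of your Colab notebook to fetch the source repository from GitHub (or any other code version-control service).
That way you can develop offline and, whenever needed, run experiments in the cloud with the latest code.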
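The clone step could look roughly like the sketch below. It is only an illustration: the function names and the repository URL in the usage note are hypothetical, and falling back to `git pull` when the directory already exists is an added assumption for re-running the cell, not something the answer specifies.

```python
import os
import subprocess


def git_sync_command(repo_url, dest):
    """Build the git command: pull if dest is already a clone, else clone."""
    if os.path.isdir(os.path.join(dest, ".git")):
        # Assumption: on a re-run, fast-forward the existing checkout.
        return ["git", "-C", dest, "pull", "--ff-only"]
    return ["git", "clone", repo_url, dest]


def sync_repo(repo_url, dest):
    """Run the command and raise CalledProcessError if git fails."""
    subprocess.run(git_sync_command(repo_url, dest), check=True)
```

A first notebook cell might then call `sync_repo("https://github.com/<user>/<repo>.git", "project")`, with the placeholders replaced by a real repository.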