Load an image dataset (folder or zip) located in Google Drive into Google Colab?

Asked: 2018-03-18 17:48:28

Tags: python neural-network google-colaboratory

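I have an image dataset on my Google Drive. I have it both as a compressed .zip file and as an uncompressed folder.

I want to train a CNN using Google Colab. How can I tell Colab where the images in my Google Drive are?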

  1. The official tutorial does not help me, as it only shows how to upload single files, not a folder with 10000 images as in my case.

  2. Then I found this answer, but the solution is not finished, or at least I did not understand how to continue after unzipping. Unfortunately I cannot comment on that answer because I don't have enough "stackoverflow points".

  3. I also found this thread, but there all the answers use other tools, such as GitHub or Dropbox.

I hope someone can explain to me what I need to do, or tell me where to look for help.

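Edit1:

I have found yet another thread asking the same question as mine. Sadly, of its 3 answers, two rely on Kaggle, which I neither know nor use. The third answer provides two links. The first points to the third thread I linked above, and the second only explains how to upload single files manually.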

3 Answers:

Answer 0 (score: 8)

Updated answer. You can now do this directly in Google Colab:

# Load the Drive helper and mount
from google.colab import drive

# This will prompt for authorization.
drive.mount('/content/drive')

!ls "/content/drive/My Drive"

Google Documentation
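
Once Drive is mounted, the zipped dataset can be unpacked onto the Colab VM's local disk, which is usually quicker to read from than the Drive mount when there are thousands of small files. A minimal sketch, assuming a hypothetical archive at `/content/drive/My Drive/dataset.zip` and a hypothetical target folder `/content/dataset` (neither path comes from the question):

```python
import os
import zipfile

# Hypothetical paths: adjust to where the archive actually lives in your Drive.
ZIP_PATH = "/content/drive/My Drive/dataset.zip"
DEST_DIR = "/content/dataset"


def extract_dataset(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the number of files extracted."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        # Directory entries end with "/"; count only real files.
        return sum(1 for name in archive.namelist() if not name.endswith("/"))


# Only runs when the archive exists, i.e. after drive.mount() has succeeded.
if os.path.exists(ZIP_PATH):
    print("Extracted files:", extract_dataset(ZIP_PATH, DEST_DIR))
```

After this, training code can point at `DEST_DIR` rather than at the Drive path.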

Answer 1 (score: 6)

As mentioned by @yl_low here:

Step 1:

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

Step 2:

from google.colab import auth
auth.authenticate_user()

Step 3:

from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

Both Step 2 and Step 3 require you to open the URL that is shown and enter the verification code it provides.

Step 4:

!mkdir -p drive
!google-drive-ocamlfuse drive

Step 5:

print('Files in Drive:')
!ls drive/
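
For the uncompressed folder, a short walk over the mounted directory can confirm that all images are visible from Colab. A minimal sketch, assuming a hypothetical folder `drive/dataset` and a guessed set of image extensions (neither is given in the question):

```python
import os

# Hypothetical folder inside the mounted Drive; not given in the question.
DATASET_DIR = "drive/dataset"
# Assumed extensions; extend as needed for your data.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_images(root):
    """Return the sorted paths of all image files found under root."""
    paths = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            # Compare case-insensitively so "IMG.JPG" is also matched.
            if name.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(os.path.join(folder, name))
    return sorted(paths)


# Only runs when the folder exists, i.e. after the mount in Step 4.
if os.path.isdir(DATASET_DIR):
    print("Images found:", len(list_images(DATASET_DIR)))
```

If the count is lower than expected, the mount may not have finished listing the folder yet.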

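Answer 2 (score: 1)

The other answers are excellent, but they require authenticating with Google Drive every time, which is inconvenient if you want to run your notebook from top to bottom.

I had the same need: I wanted to download a single zip file containing the dataset from Drive to Colab. I prefer to get a shareable link for the file and run the following cell (replace drive_url with your shared link):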

# urlretrieve lives in the urllib.request submodule, which must be imported explicitly
import urllib.request

drive_url = 'https://drive.google.com/uc?export=download&id=1fBVMX66SlvrYa0oIau1lxt1_Vy-XYZWG'
file_name = 'downloaded.zip'

urllib.request.urlretrieve(drive_url, file_name)
print('Download completed!')
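
One caveat with this approach: for large files, Drive may return an HTML confirmation page instead of the archive, and the download still appears to succeed. A minimal sketch of a check before extraction, reusing the `downloaded.zip` name from the cell above and assuming a hypothetical target folder `dataset`:

```python
import os
import zipfile


def extract_if_valid(zip_path, dest_dir):
    """Extract zip_path into dest_dir, failing loudly if it is not a zip archive."""
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(
            f"{zip_path} is not a zip archive; the download may have returned a web page"
        )
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    # Return the top-level entries so the result can be inspected quickly.
    return sorted(os.listdir(dest_dir))


file_name = "downloaded.zip"  # same name as in the download cell
if os.path.exists(file_name):
    print(extract_if_valid(file_name, "dataset"))  # "dataset" is a hypothetical folder
```

If the check fails, opening the downloaded file in a text editor usually shows what Drive actually sent back.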