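Creating a DataBunch with fastai in Colab using data from Google Drive

Time: 2019-02-07 15:31:24

Tags: python

I am trying to load a dataset from my Google Drive account in Google Colab using fast.ai. I am using the Alien vs. Predator dataset from Kaggle (from here). I downloaded it and uploaded it to my Google Drive, then ran the following code: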

# Load the Drive helper and mount
from google.colab import drive
drive.mount("/content/drive")

%reload_ext autoreload
%autoreload 2
%matplotlib inline
from fastai import *
from fastai.vision import *

path='/content/drive/My Drive/FastaiData/Alien-vs-Predator'

tfms = get_transforms(do_flip=False)

#default bs=64, set image size=100 to run successfully on colab
data = ImageDataBunch.from_folder(path,ds_tfms=tfms, size=100)

I then get this error:

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py:399: UserWarning: Your training set is empty. Is this is by design, pass `ignore_empty=True` to remove this warning.
  warn("Your training set is empty. Is this is by design, pass `ignore_empty=True` to remove this warning.")
/usr/local/lib/python3.6/dist-packages/fastai/data_block.py:402: UserWarning: Your validation set is empty. Is this is by design, use `no_split()` 
                 or pass `ignore_empty=True` when labelling to remove this warning.
  or pass `ignore_empty=True` when labelling to remove this warning.""")
IndexError                                Traceback (most recent call last)

<ipython-input-3-5b3f66a4d360> in <module>()
      4 
      5 #default bs=64, set image size=100 to run successfully on colab
----> 6 data = ImageDataBunch.from_folder(path,ds_tfms=tfms, size=100)
      7 
      8 data.show_batch(rows=3, figsize=(10,10))

/usr/local/lib/python3.6/dist-packages/fastai/vision/data.py in from_folder(cls, path, train, valid, valid_pct, classes, **kwargs)
    118         if valid_pct is None: src = il.split_by_folder(train=train, valid=valid)
    119         else: src = il.random_split_by_pct(valid_pct)
--> 120         src = src.label_from_folder(classes=classes)
    121         return cls.create_from_ll(src, **kwargs)
    122 


and so on...

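It seems to treat the folders I specified as the training and validation sets as empty, which is not the case.

Thanks for your help.

1 answer:

Answer 0 (score: 1)

After uploading the files to Google Drive, you should use PyDrive.

Sample code snippet: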

!pip install -U -q PyDrive

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# PyDrive reference:
# https://gsuitedevs.github.io/PyDrive/docs/build/html/index.html

# 2. Create & upload a text file.
uploaded = drive.CreateFile({'title': 'Sample upload.txt'})
uploaded.SetContentString('Sample upload file content')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

# 3. Load a file by ID and print its contents.
downloaded = drive.CreateFile({'id': uploaded.get('id')})
print('Downloaded content "{}"'.format(downloaded.GetContentString()))

For a detailed reference, see https://colab.research.google.com/notebooks/io.ipynb#scrollTo=zU5b6dlRwUQk
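
Since the warnings say both the training and validation sets are empty, it may also be worth checking what the mounted path actually contains before calling ImageDataBunch.from_folder. The following is a minimal sketch using only the standard library. The helper name count_images and the list of image extensions are assumptions made for illustration and are not part of fastai. It assumes the layout root/<split>/<class>/<image files>.

```python
from pathlib import Path

# Extensions treated as images here. This set is an assumption; adjust as needed.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images(root):
    """Return {split_folder: {class_folder: image_count}} for root/<split>/<class>/...

    Hypothetical helper for inspecting a dataset folder; not a fastai function.
    """
    root = Path(root)
    counts = {}
    if not root.is_dir():
        # The path is wrong or the Drive is not mounted: nothing to count.
        return counts
    for split in sorted(p for p in root.iterdir() if p.is_dir()):
        per_class = {}
        for cls in sorted(p for p in split.iterdir() if p.is_dir()):
            # Count image files anywhere below the class folder.
            per_class[cls.name] = sum(
                1
                for f in cls.rglob("*")
                if f.is_file() and f.suffix.lower() in IMAGE_EXTS
            )
        counts[split.name] = per_class
    return counts
```

Calling count_images(path) with the path from the question shows which subfolders exist and how many images each class holds. An empty result suggests the path does not point at the dataset root. If the split folders are not named train and valid, the train, valid, or valid_pct arguments visible in the from_folder signature in the traceback are probably the place to adjust.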