Downloading a large file from Google Drive to Colab

Time: 2020-07-27 22:22:23

Tags: shell google-drive-api google-colaboratory

I have a link to a publicly shared file hosted on Google Drive:

https://drive.google.com/uc?id=19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8&export=download

Below is a .sh script that works for other files and links:

#!/usr/bin/env bash
function gdrive_download () { # credit to https://github.com/ethanjperez/convince
  CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
  wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id=$1" -O $2
  rm -rf /tmp/cookies.txt
}

mkdir -p Models/real-fixed-cam Models/real-hand-held
gdrive_download 1yiNsSkPYoBZ55fSQ1iwb1io9QL_PcR2i Models/real-fixed-cam/netG_epoch_12.pth
gdrive_download 13HckO9fPAKYocdB_CAC5n8uyM3xQ2MpG Models/real-hand-held/netG_epoch_12.pth

The script above is invoked in Colab like this:

!wget https://gist.githubusercontent.com/andreyryabtsev/458f7450c630952d1e75e195f94845a0/raw/0b4336ac2a2140ac2313f9966316467e8cd3002a/download.sh
!chmod +x download.sh
!./download.sh

I adapted it to my own needs as follows:

#!/usr/bin/env bash
function gdrive_download () { # credit to https://github.com/ethanjperez/convince
  CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
  wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id=$1" -O $2
  rm -rf /tmp/cookies.txt
}

mkdir -p pix2pix/checkpoint
gdrive_download 19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8 pix2pix/checkpoint/weights.zip

The code above is invoked in Colab as follows:

!wget https://gist.githubusercontent.com/Daryl149/070397c9cb3539f5cd01173f6c44200d/raw/207a76e94e70e6c9334f48c25b4998f4fd1b95e3/download.sh
!chmod +x download.sh
!./download.sh

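The folder is created correctly. However, instead of downloading the 500 MB+ zip file into the checkpoint folder, the script downloads the HTML of the "download confirmation" page. According to the log, the script does pick up a new download confirmation string on each run, which should normally force the Google Drive download to proceed without the virus scan: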

--2020-07-27 21:55:21--  https://drive.google.com/uc?export=download&confirm=umyj&id=19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8
Resolving drive.google.com (drive.google.com)... 74.125.142.138, 74.125.142.101, 74.125.142.100, ...
Connecting to drive.google.com (drive.google.com)|74.125.142.138|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘pix2pix/checkpoint/weights.zip’
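
One way to confirm the symptom described above is to inspect what was actually saved under the .zip name. The following is a minimal sketch using only the Python standard library; the helper name describe_download is hypothetical and not part of the original script, and the HTML check is a simple heuristic on the first bytes of the file.

```python
import zipfile


def describe_download(path):
    """Return 'zip', 'html', or 'unknown' based on the file's contents."""
    # is_zipfile looks for the zip end-of-central-directory record,
    # so it does not rely on the file extension.
    if zipfile.is_zipfile(path):
        return "zip"
    # Read only the first bytes; the real file may be several hundred MB.
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    # Heuristic: an HTML page usually starts with a doctype or <html> tag.
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

In Colab this could be called as describe_download("pix2pix/checkpoint/weights.zip") after the script has run. A result other than "zip" suggests that a web page was saved instead of the archive.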

2 answers:

Answer 0 (score: 1)

Try this:

!gdown --id 19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8

You can then create a new directory with !mkdir and move weights.zip into it.

Answer 1 (score: 1)

Based on @korakot's answer, the complete working code to achieve the result in Colab is:

!gdown https://drive.google.com/uc?id=19VsarMcYRNPLTDr6b6ABJyY8JUeBueL8
!mkdir -p /content/Person_remover/pix2pix/checkpoint
import shutil
shutil.move("/content/Person_remover/weights.zip", "/content/Person_remover/pix2pix/checkpoint")
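
The create-directory-then-move step can also be done entirely in Python, so that a missing parent folder or an already existing checkpoint folder does not cause an error. This is a minimal sketch, not the answerer's code; the helper name move_into_dir is hypothetical, and it assumes the downloaded file already exists at the source path.

```python
import os
import shutil


def move_into_dir(src, dest_dir):
    """Create dest_dir if needed, move src into it, and return the new path."""
    # exist_ok=True makes this safe to re-run in the same Colab session.
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(src))
    return shutil.move(src, dest)
```

With the paths from the answer above, the call would be move_into_dir("/content/Person_remover/weights.zip", "/content/Person_remover/pix2pix/checkpoint").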