I'm new to Google Cloud Platform. I trained my model on Datalab and saved the model folder to Cloud Storage in my bucket. I can download existing files from the bucket to my local machine by right-clicking a file -> Save link as. But when I try the same steps to download a folder, I don't get the folder, just an image of it. Is there any way I can download the whole folder and its contents? Is there a gsutil command to copy a folder from Cloud Storage to a local directory?
Answer 0 (score: 5)

Answer 1 (score: 2)
gsutil -m cp -r gs://bucket-name "{path to an existing local folder}"

This definitely works.
Answer 2 (score: 2)

If you want to download data from Google Cloud Storage with Python and keep the same folder structure, follow the code I wrote in Python below.

Option 1
from google.cloud import storage
import logging
import os

def findOccurrences(s, ch):  # find the positions of '/' in the blob path; used to create folders in local storage
    return [i for i, letter in enumerate(s) if letter == ch]

def download_from_bucket(bucket_name, blob_path, local_path):
    # Create the destination folder locally if it does not exist yet
    if not os.path.exists(local_path):
        os.makedirs(local_path)
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blobs = list(bucket.list_blobs(prefix=blob_path))
    for blob in blobs:
        folderloc = findOccurrences(blob.name.replace(blob_path, ''), '/')
        if not blob.name.endswith("/"):  # skip zero-byte "folder" placeholder objects
            if blob.name.replace(blob_path, '').find("/") == -1:
                # The blob sits directly under blob_path: download it straight into local_path
                downloadpath = local_path + '/' + blob.name.replace(blob_path, '')
                logging.info(downloadpath)
                blob.download_to_filename(downloadpath)
            else:
                # The blob sits in a subfolder: recreate each missing directory level first
                for folder in folderloc:
                    create_folder = local_path + '/' + blob.name.replace(blob_path, '')[0:folder]
                    if not os.path.exists(create_folder):
                        os.makedirs(create_folder)
                downloadpath = local_path + '/' + blob.name.replace(blob_path, '')
                blob.download_to_filename(downloadpath)
                logging.info(blob.name.replace(blob_path, '')[0:blob.name.replace(blob_path, '').find("/")])
    logging.info('Blobs under {} downloaded to {}.'.format(blob_path, local_path))

bucket_name = 'google-cloud-storage-bucket-name'  # do not include gs://
blob_path = 'training/data'  # blob path in the bucket where the data is stored
local_dir = 'local-folder-name'  # local folder to download the training data into
download_from_bucket(bucket_name, blob_path, local_dir)
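On Python 3.2+ the directory bookkeeping above can be collapsed with os.makedirs(..., exist_ok=True). Below is a minimal sketch of the same idea, not part of the original answer; download_tree is a hypothetical helper name.

from google.cloud import storage
import os

def download_tree(bucket_name, prefix, local_dir):
    # Compact variant: recreate the blob hierarchy under local_dir
    client = storage.Client()
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        if blob.name.endswith('/'):
            continue  # skip zero-byte "folder" placeholder objects
        rel = blob.name[len(prefix):].lstrip('/')
        dest = os.path.join(local_dir, rel)
        os.makedirs(os.path.dirname(dest) or '.', exist_ok=True)
        blob.download_to_filename(dest)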
Option 2: use the gsutil SDK. Below is another way to do this from a Python program.
import os

def download_bucket_objects(bucket_name, blob_path, local_path):
    # blob_path is the folder name inside the bucket
    command = "gsutil cp -r gs://{bucketname}/{blobpath} {localpath}".format(
        bucketname=bucket_name, blobpath=blob_path, localpath=local_path)
    os.system(command)
    return None
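Note that os.system gives you no error handling: if gsutil fails, the function returns silently. A sketch of the same call through subprocess.run instead (the -m flag is added here for parallel copies; otherwise the command is unchanged):

import subprocess

def download_bucket_objects(bucket_name, blob_path, local_path):
    # Passing the command as a list avoids shell-quoting issues with spaces in paths
    command = ["gsutil", "-m", "cp", "-r",
               "gs://{}/{}".format(bucket_name, blob_path), local_path]
    subprocess.run(command, check=True)  # raises CalledProcessError if gsutil fails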
Answer 3 (score: 1)

Here is how to download a folder from a Google Cloud Storage bucket.

Run the following command to download it from bucket storage to your Google Cloud Console local path:

gsutil -m cp -r gs://{bucketname}/{folderPath} {localpath}

After running the command, confirm the folder is on your local path by running ls to list the files and directories there.

Now zip your folder by running the command below:

zip -r foldername.zip yourfolder/*

Once the zip process is complete, click the More dropdown menu on the right side of the Google Cloud Console, then select the "Download file" option. You will be prompted for the name of the file to download; enter the zip file name, "foldername.zip".
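If you would rather script the zip step too, shutil.make_archive from the Python standard library does the same job as the zip command. A minimal sketch, assuming the folder has already been copied into the current directory by the gsutil command above ('foldername' and 'yourfolder' are the placeholders from this answer):

import shutil

# Produces foldername.zip in the current directory from the contents of yourfolder/
shutil.make_archive('foldername', 'zip', root_dir='.', base_dir='yourfolder')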
Answer 4 (score: 0)

Prerequisite: the Google Cloud SDK is installed and initialized ($ gcloud init)

Command:

gsutil -m cp -r gs://bucket-name .

This copies all the files using multithreading, which is faster. I found that the "dir" argument the official gsutil documentation instructs you to use did not work.
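For similar parallelism from Python, recent versions of the google-cloud-storage library ship a transfer_manager module. A sketch, assuming google-cloud-storage >= 2.7 and the placeholder bucket name used above:

from google.cloud.storage import Client, transfer_manager

client = Client()
bucket = client.bucket("bucket-name")  # placeholder bucket name
blob_names = [b.name for b in client.list_blobs(bucket) if not b.name.endswith("/")]
# Download the blobs concurrently, recreating their paths under the destination directory
transfer_manager.download_many_to_path(bucket, blob_names, destination_directory=".", max_workers=8)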
Answer 5 (score: 0)

Here is some code I wrote. It downloads the complete directory structure to your VM/local storage.
from google.cloud import storage
import os

bucket_name = "ar-data"
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)

dirName = 'Data_03_09/'  # *** folder in the bucket whose content you want to download
blobs = bucket.list_blobs(prefix=dirName)  # , delimiter='/'
destpath = r'/home/jupyter/DATA_test/'  # *** path on your VM/local machine to download the bucket directory into

for blob in blobs:
    if blob.name.endswith('/'):
        continue  # skip zero-byte "folder" placeholder objects
    # strip the prefix explicitly; str.lstrip() would strip characters, not the prefix
    relpath = blob.name[len(dirName):]
    currpath = destpath
    for n in relpath.split('/')[:-1]:
        currpath = os.path.join(currpath, n)
        if not os.path.exists(currpath):
            print('creating directory-', n, 'on path-', currpath)
            os.mkdir(currpath)
    print("downloading ...", relpath)
    blob.download_to_filename(os.path.join(destpath, relpath))
Or simply run, in a terminal:
gsutil -m cp -r gs://{bucketname}/{folderPath} {localpath}