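Uploading multiple files to Google Cloud Storage with the Python client library

Time: 2017-09-19 20:03:26

Tags: google-cloud-platform google-cloud-storage google-cloud-dataproc google-cloud-python

A script in the GCP Python documentation contains the following function: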

from google.cloud import storage


def upload_pyspark_file(project_id, bucket_name, filename, file):
    """Uploads the PySpark file in this directory to the configured
    input bucket."""
    print('Uploading pyspark file to GCS')
    client = storage.Client(project=project_id)
    bucket = client.get_bucket(bucket_name)
    blob = bucket.blob(filename)
    blob.upload_from_file(file)

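In my script I have written an argument-parsing function that accepts multiple arguments (file names) to upload to a GCS bucket. I am trying to adapt the function above so that it takes those multiple args and uploads each of the files, but I am not sure how to proceed. My confusion is about the 'filename' and 'file' variables above. How can I adapt the function for this purpose?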

1 answer:

Answer 0 (score: 1)

I don't suppose you are still looking for something like this?

from google.cloud import storage
import os

source_dir = 'data-files'
files = os.listdir(source_dir)
client = storage.Client.from_service_account_json('cred.json')
bucket = client.get_bucket('xxxxxx')


def upload_pyspark_file(filename, file):
    """Uploads the local file at path `file` to the bucket as object `filename`."""
    print('Uploading from', file, 'to', filename)
    blob = bucket.blob(filename)
    # upload_from_filename takes a path string;
    # upload_from_file expects an already opened file object.
    blob.upload_from_filename(file)


for f in files:
    # os.path.join builds the path portably instead of hard-coding a backslash
    upload_pyspark_file(f, os.path.join(source_dir, f))

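The difference between file and filename is, as you may have guessed, that file is the source file and filename is the destination (the name of the object in the bucket). Note that blob.upload_from_file expects an open file object, as in the original function from the docs; when the source is given as a path string, blob.upload_from_filename is the call to use.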
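
Since the question mentions an argument-parsing function, here is a rough sketch of how the file names from the command line could be wired into the upload. The names parse_args, upload_files and main, and the bucket_name / paths arguments, are my own and not from the GCP sample; it also assumes default credentials are configured (swap in Client.from_service_account_json if you use a key file) and that using each file's base name as the object name is what you want.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the target bucket name plus one or more local file paths."""
    parser = argparse.ArgumentParser(description="Upload files to a GCS bucket.")
    parser.add_argument("bucket_name", help="name of the destination bucket")
    parser.add_argument("paths", nargs="+", help="local files to upload")
    return parser.parse_args(argv)


def upload_files(bucket, paths):
    """Upload each local path to `bucket`, using its base name as the object name."""
    uploaded = []
    for path in paths:
        filename = os.path.basename(path)  # destination object name
        blob = bucket.blob(filename)
        blob.upload_from_filename(path)  # source given as a path string
        uploaded.append(filename)
    return uploaded


def main(argv=None):
    # Imported here so the helpers above can be exercised without the library.
    from google.cloud import storage

    args = parse_args(argv)
    client = storage.Client()  # assumption: default credentials are set up
    bucket = client.get_bucket(args.bucket_name)
    for name in upload_files(bucket, args.paths):
        print("Uploaded", name)
```

Calling main() from your script's entry point and running it with a bucket name followed by any number of paths should then upload each file in turn. Passing the bucket into upload_files, rather than creating the client inside it, means the client is created once instead of once per file.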