A script in the GCP Python documentation contains the following function:
def upload_pyspark_file(project_id, bucket_name, filename, file):
    """Uploads the PySpark file in this directory to the configured
    input bucket."""
    print('Uploading pyspark file to GCS')
    client = storage.Client(project=project_id)
    bucket = client.get_bucket(bucket_name)
    blob = bucket.blob(filename)
    blob.upload_from_file(file)
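In my own script I wrote an argument-parsing function that accepts multiple arguments (file names) to upload to a GCS bucket. I am trying to adapt the function above so that it handles those multiple args and uploads each file, but I am not sure how to proceed. My confusion is about the 'filename' and 'file' variables above. How can I adapt the function for my purpose?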
Answer 0 (score: 1)
I don't suppose you are still looking for something like this?
from google.cloud import storage
import os

files = os.listdir('data-files')
client = storage.Client.from_service_account_json('cred.json')
bucket = client.get_bucket('xxxxxx')

def upload_pyspark_file(filename, file):
    """Uploads the local file at path `file` to the bucket as `filename`."""
    print('Uploading from', file, 'to', filename)
    blob = bucket.blob(filename)
    # `file` is a path string here, so use upload_from_filename;
    # upload_from_file expects an already opened file object.
    blob.upload_from_filename(file)

for f in files:
    # os.path.join keeps the path portable instead of hard-coding "\\"
    upload_pyspark_file(f, os.path.join('data-files', f))
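The difference between file and filename is, as you may have guessed, that file is the source file on your machine and filename is the destination, i.e. the name the object gets in the bucket.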
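To tie this back to the argument parsing in the question, here is a minimal sketch of how several paths from the command line could be uploaded in one run. The names parse_args, upload_files and main, and the argument names bucket_name and files, are my own choices for illustration and not part of the GCP sample. I am also assuming each file should be stored under its base name and that default credentials are available for storage.Client().

```python
import argparse
import os


def parse_args(argv=None):
    """Parse a bucket name followed by one or more local file paths."""
    parser = argparse.ArgumentParser(description='Upload files to a GCS bucket')
    parser.add_argument('bucket_name', help='name of the destination bucket')
    # nargs='+' collects one or more paths into a list
    parser.add_argument('files', nargs='+', help='local files to upload')
    return parser.parse_args(argv)


def upload_files(bucket, paths):
    """Upload each local path to `bucket`, using its base name as object name."""
    uploaded = []
    for path in paths:
        filename = os.path.basename(path)  # destination name in the bucket
        blob = bucket.blob(filename)
        blob.upload_from_filename(path)    # source path on the local disk
        uploaded.append(filename)
    return uploaded


def main(argv=None):
    # Imported here so the helpers above do not need the library to be loaded
    from google.cloud import storage

    args = parse_args(argv)
    client = storage.Client()  # assumption: default credentials are configured
    bucket = client.get_bucket(args.bucket_name)
    for name in upload_files(bucket, args.files):
        print('Uploaded', name)
```

Call main() from your script's entry point and run it as, for example, python upload.py my-bucket job1.py job2.py (the script and file names are placeholders). If you already hold an open file object rather than a path, blob.upload_from_file(fileobj) is the call to use instead of upload_from_filename.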