The project-lib documentation shows how to save a pandas DataFrame to the project assets:
# Import the lib
from project_lib import Project
project = Project(sc,"<ProjectId>", "<ProjectToken>")
# let's assume you have the pandas DataFrame pandas_df which contains the data
# you want to save in your object storage as a csv file
project.save_data("file_name.csv", pandas_df.to_csv())
# the function returns a dict which contains the asset_id, bucket_name and file_name
# upon successful saving of the data
But what if I have a local file, for example one downloaded like this...
! wget url_to_binary_file
How do I then upload that file to the project assets?
Answer 0 (score: 0)
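I needed to read the file as bytes. Note that this reads the whole file into memory, so don't attempt it if your file is larger than the available memory: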
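import io
filename = 'thefilename'
# open in binary mode and wrap the raw bytes in an in-memory buffer
with open(filename, 'rb') as z:
    data = io.BytesIO(z.read())
    # `project` is the Project object created as in the question
    project.save_data(
        filename, data, set_project_asset=True, overwrite=True
    )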
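Since the whole file ends up in memory, it may be worth checking its size before reading it. Below is a minimal sketch of that idea, not part of project-lib: the helper name read_file_as_bytesio and the 100 MiB default limit are my own assumptions, so adjust them to your environment.

```python
import io
import os


def read_file_as_bytesio(path, max_bytes=100 * 1024 * 1024):
    """Read a local file into an in-memory buffer.

    Hypothetical helper: refuses files larger than max_bytes
    (the 100 MiB default is an arbitrary assumption).
    """
    size = os.path.getsize(path)
    if size > max_bytes:
        raise ValueError(
            f"{path} is {size} bytes, above the {max_bytes}-byte limit"
        )
    # Binary mode so the bytes are read unchanged
    with open(path, "rb") as f:
        return io.BytesIO(f.read())
```

The returned buffer could then be passed as the data argument of project.save_data, in the same way as data in the snippet above.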