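I have a project where I need to write a DataFrame to an xlsx file in an S3 bucket. Loading a file from S3 with pandas is simple: df = pd.read_excel('s3://path/file.xlsx')
But writing a file to S3 is giving me problems.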
import pandas as pd
# Create a Pandas dataframe from the data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('s3://path/', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
FileNotFoundError: [Errno 2] No such file or directory: 's3://path'
So how can I write an xlsx file to S3 using pandas (preferably with tabs)?
Answer 0 (score: 2)
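import io
import boto3
import xlsxwriter  # not called directly; must be installed for engine='xlsxwriter'
import pandas as pd

bucket = 'your-s3-bucketname'
filepath = 'path/to/your/file.format'
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})

# Write the workbook into an in-memory buffer instead of a local file.
with io.BytesIO() as output:
    with pd.ExcelWriter(output, engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='sheet_name')
    # The writer is closed at this point, so the buffer holds the complete file.
    data = output.getvalue()

# Upload the bytes as an S3 object.
s3 = boto3.resource('s3')
s3.Bucket(bucket).put_object(Key=filepath, Body=data)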
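
If this is needed in more than one place, the same buffer-then-upload idea can be split into two small helpers. This is only a sketch: the names dataframe_to_xlsx_bytes and upload_xlsx are made up for illustration, and the S3 client is passed in as a parameter (for example boto3.client('s3')) so the serialization step can be checked without touching AWS. No engine is specified here, so pandas uses whichever Excel writer is installed.

```python
import io

import pandas as pd


def dataframe_to_xlsx_bytes(df, sheet_name="Sheet1"):
    """Serialize a DataFrame to an in-memory .xlsx file and return its bytes.

    Hypothetical helper, not part of pandas or boto3.
    """
    buffer = io.BytesIO()
    # No engine is given, so pandas picks an installed writer
    # (XlsxWriter or openpyxl).
    with pd.ExcelWriter(buffer) as writer:
        df.to_excel(writer, sheet_name=sheet_name)
    # The workbook is only complete once the writer has been closed.
    return buffer.getvalue()


def upload_xlsx(s3_client, df, bucket, key, sheet_name="Sheet1"):
    """Upload df as an .xlsx object.

    s3_client is assumed to expose put_object(Bucket=..., Key=..., Body=...),
    as the client returned by boto3.client('s3') does.
    """
    body = dataframe_to_xlsx_bytes(df, sheet_name=sheet_name)
    return s3_client.put_object(Bucket=bucket, Key=key, Body=body)
```

Usage would look something like upload_xlsx(boto3.client('s3'), df, 'your-s3-bucketname', 'path/to/your/file.xlsx'), with the bucket and key replaced by your own.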