I am trying to read multiple Parquet files from an AWS S3 bucket and combine them all into one large pandas DataFrame. I have:
bucket = s3.Bucket(name='mybucket')
objects = []
keys = []
for obj in bucket.objects.all():
    subsrc = obj.Object()
    key = obj.key
    body = obj.get()['Body'].read()
    objects.append(body)
    keys.append(key)
But when I print objects[0], it is just the letter "b".
I was also thinking of doing something like:
count = 0
for file in bucket.objects.all():
    obj = s3.get_object(Bucket="my-bucket", Key=keys[count])
    obj_df = pd.read_parquet(obj["Body"])
    df_list.append(obj_df)
    count += 1
But this gives me:
AttributeError: 's3.ServiceResource' object has no attribute 'get_object'
Then, when I comment out the get_object line, I get:
TypeError: Cannot convert bytes to pyarrow.lib.NativeFile
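For reference, here is a minimal sketch of the direction these two errors seem to point in, not a confirmed fix. As far as I can tell, get_object is a method of the low-level client (boto3.client('s3')) rather than of the resource object, and pd.read_parquet wants a path or a file-like object rather than raw bytes, so each body is wrapped in io.BytesIO. The helper names frames_from_blobs and read_bucket, the reader parameter, and the ".parquet" suffix filter are hypothetical and only there for illustration.

```python
import io

import pandas as pd


def frames_from_blobs(blobs, reader=pd.read_parquet):
    """Parse each raw bytes payload into a DataFrame and concatenate them.

    `reader` is a hypothetical hook so the parsing function can be swapped;
    it defaults to pd.read_parquet, which needs pyarrow or fastparquet.
    """
    # The reader expects a path or file-like object, not raw bytes,
    # so each payload is wrapped in an in-memory buffer first.
    frames = [reader(io.BytesIO(blob)) for blob in blobs]
    # pd.concat raises ValueError if no payloads were supplied.
    return pd.concat(frames, ignore_index=True)


def read_bucket(bucket_name, suffix=".parquet"):
    """Download every object whose key ends with `suffix` and combine them."""
    import boto3  # imported lazily so frames_from_blobs works without boto3

    bucket = boto3.resource("s3").Bucket(bucket_name)
    blobs = (
        obj.get()["Body"].read()  # raw bytes of one S3 object
        for obj in bucket.objects.all()
        if obj.key.endswith(suffix)
    )
    return frames_from_blobs(blobs)
```

With valid AWS credentials this would be called as df = read_bucket("mybucket"). Every file is held in memory at once, so this only suits buckets whose total size fits in RAM.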
Thank you very much!