I have the following code:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import boto3
s3 = boto3.resource('s3', region_name='us-east-2')
bucket = s3.Bucket('sentinel-s2-l1c')
object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
object.download_file('B01.jp2')
img=mpimg.imread('B01.jp2')
imgplot = plt.imshow(img)
plt.show(imgplot)
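It works. The problem is that it first downloads the file to the current directory. Is it possible to read the file directly in RAM and decode it as an image?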
Answer 0 (score: 18)
I suggest using the io module to read the file directly into memory, without having to use a temporary file.
For example:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import boto3
import io
s3 = boto3.resource('s3', region_name='us-east-2')
bucket = s3.Bucket('sentinel-s2-l1c')
object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
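file_stream = io.BytesIO()  # image data is binary, so use BytesIO rather than StringIO
object.download_fileobj(file_stream)
file_stream.seek(0)  # rewind so imread starts reading from the first byte
img = mpimg.imread(file_stream, format='jp2')  # a stream has no file name, so name the format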
# whatever you need to do
For binary data such as images, use io.BytesIO; io.StringIO only accepts text.
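To see why the buffer type matters, here is a minimal sketch that needs no S3 access. `fake_download_fileobj` is a hypothetical stand-in for `download_fileobj`, assumed only to write raw bytes into whatever file-like object it is given.

```python
import io


def fake_download_fileobj(fileobj):
    # Hypothetical stand-in for download_fileobj: writes raw bytes.
    fileobj.write(b"\x00\x00\x00\x0cjP  \r\n\x87\n")


# A text buffer rejects bytes with a TypeError.
try:
    fake_download_fileobj(io.StringIO())
    text_buffer_ok = True
except TypeError:
    text_buffer_ok = False

# A bytes buffer accepts the same data.
byte_buffer = io.BytesIO()
fake_download_fileobj(byte_buffer)
payload = byte_buffer.getvalue()
```

The same reasoning should apply to any binary download, not only to JPEG 2000 tiles.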
Answer 1 (score: 11)
I'd suggest using Python's NamedTemporaryFile from the tempfile module. It creates a temporary file that is deleted when the file is closed (thanks @NoamG).
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import boto3
import tempfile
s3 = boto3.resource('s3', region_name='us-east-2')
bucket = s3.Bucket('sentinel-s2-l1c')
object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
tmp = tempfile.NamedTemporaryFile()
with open(tmp.name, 'wb') as f:
    object.download_fileobj(f)
# Leaving the with block closes f, so the downloaded bytes are flushed to disk before reading
img = mpimg.imread(tmp.name)
# ...Do jobs using img
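The cleanup behaviour can be checked without S3. In this sketch `fake_download_fileobj` is a hypothetical stand-in for `download_fileobj`, and the data is written straight into the temporary file object instead of reopening it by name; that is a variation on the code above, not part of the answer.

```python
import os
import tempfile


def fake_download_fileobj(fileobj):
    # Hypothetical stand-in for download_fileobj: writes 1024 raw bytes.
    fileobj.write(b"\x00\x01\x02\x03" * 256)


with tempfile.NamedTemporaryFile(suffix=".jp2") as tmp:
    fake_download_fileobj(tmp)
    tmp.flush()  # push buffered bytes to disk before anything reads by name
    path = tmp.name
    size_on_disk = os.path.getsize(path)

# Closing the file (here, by leaving the with block) deletes it.
still_exists = os.path.exists(path)
```

Note that the Python documentation says reopening a named temporary file by name while it is still open works on Unix but not on Windows, so the reopen-by-name pattern is not portable.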
Answer 2 (score: 6)
By specifying the file format in imread(), the image can be streamed.
import boto3
from io import BytesIO
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
resource = boto3.resource('s3', region_name='us-east-2')
bucket = resource.Bucket('sentinel-s2-l1c')
image_object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
image = mpimg.imread(BytesIO(image_object.get()['Body'].read()), 'jp2')
plt.figure(0)
plt.imshow(image)
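An in-memory stream carries no file name, so the format has to come from somewhere else. One option, sketched here under the assumption that the object key ends in a meaningful extension, is to derive the format string from the key. `format_from_key` is a hypothetical helper, not part of boto3 or matplotlib.

```python
from pathlib import PurePosixPath


def format_from_key(key):
    # Hypothetical helper: return the key's extension without the leading dot.
    return PurePosixPath(key).suffix.lstrip(".").lower()


fmt = format_from_key("tiles/10/S/DG/2015/12/7/0/B01.jp2")
```

The result could then be passed as the second argument of `imread()` in place of the hard-coded `'jp2'`.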
Answer 3 (score: 2)
A slightly different approach, using the client:
import boto3
import io
from matplotlib import pyplot as plt
client = boto3.client("s3")
bucket='my_bucket'
key= 'my_key'
outfile = io.BytesIO()
client.download_fileobj(bucket, key, outfile)
outfile.seek(0)
img = plt.imread(outfile)
plt.imshow(img)
plt.show()
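The `outfile.seek(0)` call is easy to overlook. A small sketch of why it is needed, with a plain `write` standing in for the download (an assumption made so the example runs without S3):

```python
import io

buffer = io.BytesIO()
buffer.write(b"image bytes")  # stand-in for client.download_fileobj(...)

# After writing, the position sits at the end, so a read returns nothing.
read_without_rewind = buffer.read()

# Rewinding makes the whole payload readable again.
buffer.seek(0)
read_after_rewind = buffer.read()
```

Any reader handed the buffer without the rewind would see an empty stream.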
Answer 4 (score: 0)
object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
img_data = object.get().get('Body').read()
Answer 5 (score: 0)
This is a further development of Greg Merritt's answer that resolves all the errors raised in the comments: it uses BytesIO instead of StringIO, and PIL Image instead of matplotlib.image.
The following functions work with python3 and boto3. The write_image_to_s3 function is included as a bonus.
import boto3
import numpy as np
from io import BytesIO
from PIL import Image

def read_image_from_s3(bucket, key, region_name='ap-southeast-1'):
    """Load image file from s3.
    Parameters
    ----------
    bucket: string
        Bucket name
    key : string
        Path in s3
    region_name : string
        AWS region of the bucket
    Returns
    -------
    np array
        Image array
    """
    # Use the region passed by the caller instead of a hard-coded one
    s3 = boto3.resource('s3', region_name=region_name)
    bucket = s3.Bucket(bucket)
    object = bucket.Object(key)
    response = object.get()
    file_stream = response['Body']
    im = Image.open(file_stream)
    return np.array(im)

def write_image_to_s3(img_array, bucket, key, region_name='ap-southeast-1'):
    """Write an image array into S3 bucket
    Parameters
    ----------
    img_array : np array
        Image array
    bucket: string
        Bucket name
    key : string
        Path in s3
    region_name : string
        AWS region of the bucket
    Returns
    -------
    None
    """
    s3 = boto3.resource('s3', region_name=region_name)
    bucket = s3.Bucket(bucket)
    object = bucket.Object(key)
    # Encode the array as JPEG in memory, then upload the encoded bytes
    file_stream = BytesIO()
    im = Image.fromarray(img_array)
    im.save(file_stream, format='jpeg')
    object.put(Body=file_stream.getvalue())
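The in-memory halves of these two functions can be exercised without S3. In the sketch below, `encode_image` and `decode_image` are hypothetical helpers that mirror the encoding and decoding steps, and a plain bytes object stands in for the S3 object body. PNG is used only because it is lossless; that choice is an assumption of this example, not something the answer prescribes.

```python
from io import BytesIO

import numpy as np
from PIL import Image


def encode_image(img_array, fmt="png"):
    # Hypothetical helper: encode an array into image bytes in memory.
    stream = BytesIO()
    Image.fromarray(img_array).save(stream, format=fmt)
    return stream.getvalue()


def decode_image(data):
    # Hypothetical helper: decode image bytes back into an array.
    return np.array(Image.open(BytesIO(data)))


original = np.arange(12, dtype=np.uint8).reshape(3, 4)
restored = decode_image(encode_image(original))
```

With `format='jpeg'`, as in `write_image_to_s3`, the compression is lossy, so pixel values read back would generally not match the original array exactly.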
Answer 6 (score: 0)
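Hyeungshik Jung's temporary-file solution looks good, but I noticed that the file seems to be downloaded in some lazy fashion. This leads to the following behavior: if you check img.shape, you get an empty dimension tuple () as the return value, even though object.download_fileobj(f) has already been called. I solved this by applying f.seek(0,2) to the file descriptor. After that, all subsequent operations work properly, for example returning the correct dimensions (704, 1024).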
...
tmp = tempfile.NamedTemporaryFile()
with open(tmp.name, 'wb') as f:
    object.download_fileobj(f)
    f.seek(0,2)
    img=mpimg.imread(tmp.name)
    print(img.shape)
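A possible explanation, offered as an assumption rather than something the answer states, is write buffering: bytes written through `f` may still sit in its buffer when the file is read back by name. The sketch below reproduces that effect with a plain `write` standing in for the download, and shows that seeking flushes the buffer.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "B01.jp2")
    with open(path, "wb") as f:
        f.write(b"x" * 100)                      # stand-in for the download
        size_before = os.path.getsize(path)      # data is still in the buffer
        f.seek(0, 2)                             # seeking flushes the buffer
        size_after_seek = os.path.getsize(path)
    size_after_close = os.path.getsize(path)     # closing flushes as well
```

If buffering is indeed the cause, calling `f.flush()`, or reading the file only after the `with` block has closed it, would be expected to have the same effect as `f.seek(0,2)`.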