How do I read a large tif file in Python?

Time: 2015-05-26 17:58:13

Tags: python numpy python-imaging-library tiff

I am loading a tiff file from http://oceancolor.gsfc.nasa.gov/DOCS/DistFromCoast/
from PIL import Image
im = Image.open('GMT_intermediate_coast_distance_01d.tif')

The data is large (im.size=(36000, 18000), 1.3GB), and the usual conversion does not work; that is, imarray.shape returns ()

import numpy as np 
imarray=np.zeros(im.size)
imarray=np.array(im)

How can I convert this tiff file to a numpy.array?

4 answers:

Answer 0: (score: 2)

You probably don't have enough RAM for this image. You need more than 1.3GB of free memory at the very least.
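
To put a number on that, here is a rough back-of-the-envelope sketch. It assumes a single band and 2 bytes per pixel (16-bit samples), which is consistent with the 1.3GB figure in the question but is not confirmed there; `array_size_gb` is a hypothetical helper name.

```python
# Image dimensions reported by im.size in the question (width, height).
WIDTH, HEIGHT = 36000, 18000


def array_size_gb(width, height, bytes_per_pixel):
    """Size in GB (10**9 bytes) of a dense single-band array."""
    return width * height * bytes_per_pixel / 1e9


# Assumption: 16-bit samples, i.e. 2 bytes per pixel.
as_int16 = array_size_gb(WIDTH, HEIGHT, 2)

# np.zeros() defaults to float64, i.e. 8 bytes per pixel.
as_float64 = array_size_gb(WIDTH, HEIGHT, 8)

print("int16 array:   %.3f GB" % as_int16)
print("float64 array: %.3f GB" % as_float64)
```

Under these assumptions the np.zeros(im.size) call in the question asks for about four times as much memory as the pixel data itself, because it allocates float64 values.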

I don't know what you are doing with the image, and you are reading the whole thing into memory, but I suggest reading it bit by bit if that is possible, to avoid blowing up your computer. You can use Image.getdata(), which returns one pixel at a time.

You can also read more about Image.getdata() at this link:

http://www.pythonware.com/library/pil/handbook/

Answer 1: (score: 2)

I have tested many alternatives so far, but only gdal worked with huge 16-bit images.

You can open the image with something like this:

from osgeo import gdal
import numpy as np
ds = gdal.Open("name.tif")
channel = np.array(ds.GetRasterBand(1).ReadAsArray())
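
If even one full band does not fit in memory, GDAL can also read a rectangular window of a band, since ReadAsArray accepts an offset and a window size. The following is a minimal sketch under that assumption; `iter_row_windows` and `max_in_blocks` are hypothetical helper names, and the block height of 1000 rows is an arbitrary choice.

```python
def iter_row_windows(width, height, block_rows):
    """Yield (xoff, yoff, xsize, ysize) windows covering the raster in row blocks."""
    for yoff in range(0, height, block_rows):
        yield 0, yoff, width, min(block_rows, height - yoff)


def max_in_blocks(path, block_rows=1000):
    """Return the maximum pixel value of band 1, reading block_rows rows at a time."""
    # Imported here so that iter_row_windows can be used without GDAL installed.
    from osgeo import gdal

    ds = gdal.Open(path)
    if ds is None:
        raise IOError("GDAL could not open %r" % path)
    band = ds.GetRasterBand(1)

    result = None
    for xoff, yoff, xsize, ysize in iter_row_windows(
            ds.RasterXSize, ds.RasterYSize, block_rows):
        # Only this window is held in memory, not the whole band.
        block = band.ReadAsArray(xoff, yoff, xsize, ysize)
        block_max = block.max()
        result = block_max if result is None else max(result, block_max)
    return result
```

Any reduction that can be computed block by block (sum, histogram, downsampling) fits the same loop; only the body that handles `block` changes.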

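Answer 2: (score: 1)

For 32-bit Python 2.7, you are limited in how many bytes you can allocate at a given time. One option is to read the image in parts, then resize the individual chunks and recombine them into an image that needs less RAM.

I recommend using the packages libtiff and opencv for this: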

    import os
    os.environ["PATH"] += os.pathsep + "C:\\Program Files (x86)\\GnuWin32\\bin"
    import numpy as np
    import libtiff
    import cv2

    tif = libtiff.TIFF.open("HUGETIFFILE.tif", 'r')
    width = tif.GetField("ImageWidth")
    height = tif.GetField("ImageLength")
    bits = tif.GetField('BitsPerSample')
    sample_format = tif.GetField('SampleFormat')
    
    ResizeFactor = 10 #Reduce Image Size by 10
    Chunks = 8 #Read Image in 8 Chunks to prevent Memory Error (can be increased for 
    # bigger files)

    ReadStrip = tif.ReadEncodedStrip
    typ = tif.get_numpy_type(bits, sample_format)


    #ReadStrip
    # Integer division (//) keeps shapes and strip indices integral on Python 2 and 3
    newarr = np.zeros((1, width//ResizeFactor), typ)
    for ii in range(0,Chunks):
        pos = 0
        arr = np.empty((height//Chunks, width), typ)
        size = arr.nbytes
        for strip in range((ii*tif.NumberOfStrips()//Chunks),((ii+1)*tif.NumberOfStrips()//Chunks)):
            elem = ReadStrip(strip, arr.ctypes.data + pos, max(size-pos, 0))
            pos = pos + elem

        resized = cv2.resize(arr, (0,0), fx=float(1)/float(ResizeFactor), fy=float(1)/float(ResizeFactor))

        # Now remove the large array to free up Memory for the next chunk
        del arr
        # Finally recombine the individual resized chunks into the final resized image.
        newarr = np.vstack((newarr,resized))

    newarr = np.delete(newarr, (0), axis=0)
    cv2.imwrite('resized.tif', newarr)
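
The read, shrink, recombine pattern can also be sketched with NumPy alone. This is an illustration rather than a drop-in replacement: it swaps cv2.resize for plain block averaging, collects the shrunken chunks in a list and stacks them once at the end, and takes a hypothetical `read_rows(start, stop)` callable standing in for whatever actually reads rows from the file.

```python
import numpy as np


def downsample_chunk(chunk, factor):
    """Shrink a 2-D chunk by averaging non-overlapping factor x factor blocks."""
    rows, cols = chunk.shape
    rows -= rows % factor  # drop trailing rows/cols that do not fill a block
    cols -= cols % factor
    blocks = chunk[:rows, :cols].reshape(
        rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


def downsample_in_chunks(read_rows, height, chunk_rows, factor):
    """Downsample an image whose rows are fetched on demand by read_rows."""
    chunk_rows -= chunk_rows % factor  # keep chunk edges aligned to blocks
    if chunk_rows <= 0:
        raise ValueError("chunk_rows must be at least as large as factor")

    pieces = []
    for start in range(0, height, chunk_rows):
        stop = min(start + chunk_rows, height)
        chunk = read_rows(start, stop)  # only this chunk is in memory
        if chunk.shape[0] >= factor:
            pieces.append(downsample_chunk(chunk, factor))
        del chunk  # release the large array before reading the next one
    # Stack once at the end instead of growing an array inside the loop.
    return np.vstack(pieces)
```

For example, with an in-memory array standing in for the file, `downsample_in_chunks(lambda a, b: source[a:b], source.shape[0], 4, 2)` gives the same result as downsampling `source` in one go.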

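Answer 3: (score: 0)

I have huge tif files of 1 to 3 GB, and I finally managed to open them with Image.open() after manually changing the MAX_IMAGE_PIXELS value in the Image.py source code to an arbitrarily large number: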

import numpy as np
from PIL import Image
im = np.asarray(Image.open("location/image.tif"))
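
The same limit can be changed without touching the library source, because Pillow exposes it as the module attribute Image.MAX_IMAGE_PIXELS, and setting it to None turns the size check off. A minimal sketch, assuming a reasonably recent Pillow and files you trust; `load_as_array` is a hypothetical helper name.

```python
import numpy as np
from PIL import Image

# Disable Pillow's pixel-count check for this process.
# Only do this for files from a trusted source.
Image.MAX_IMAGE_PIXELS = None


def load_as_array(path_or_file):
    """Open an image with Pillow and return its pixels as a NumPy array."""
    return np.asarray(Image.open(path_or_file))
```

The override lasts only for the running process, so it survives Pillow upgrades that would overwrite an edited Image.py.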