How do I read the .pfm files provided with the Middlebury dataset?

Time: 2016-05-06 13:00:13

Tags: python file endianness

The dataset is here: http://vision.middlebury.edu/stereo/data/scenes2014/

The PFM file format description is here: http://davis.lbl.gov/Manuals/NETPBM/doc/pfm.html

I am trying to read the files with the following code:

      header = file.readline().rstrip()
      if header == 'PF':
        color = True    
      elif header == 'Pf':
        color = False
      else:
        raise Exception('Not a PFM file.')

      dim_match = re.match(r'^(\d+)\s(\d+)\s$', file.readline())
      if dim_match:
        width, height = map(int, dim_match.groups())
      else:
        raise Exception('Malformed PFM header.')

      scale = float(file.readline().rstrip())
      if scale < 0: # little-endian
        endian = '<'
        scale = -scale
      else:
        endian = '>' # big-endian
      data = np.fromfile(file, endian + 'f')
      shape = (height, width, 3) if color else (height, width)
      return np.reshape(data, (shape[0]-1, shape[1])), scale

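But I end up with very strange values in my array. This is just one of several variants I have tried, and none of them gives results that look right. It would be great if someone could help me understand how to read these files correctly.

I am using Windows and Python 2.7.11.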

1 answer:

Answer 0: (score: 0)

    import re
    from struct import unpack

    import matplotlib.pyplot as plt
    import numpy as np


    def read_pfm(file):
        # Adapted from https://stackoverflow.com/questions/48809433/read-pfm-format-in-python
        with open(file, "rb") as f:
            # Line 1: PF => RGB (3 channels), Pf => greyscale (1 channel)
            header = f.readline().decode('latin-1')
            if "PF" in header:
                channels = 3
            elif "Pf" in header:
                channels = 1
            else:
                raise ValueError('Not a PFM file.')
            # Line 2: width height
            line = f.readline().decode('latin-1')
            width, height = re.findall(r'\d+', line)
            width = int(width)
            height = int(height)

            # Line 3: positive scale means big-endian, negative means little-endian
            line = f.readline().decode('latin-1')
            big_endian = "-" not in line
            # Slurp all binary data (4 bytes per float32 sample)
            samples = width * height * channels
            buffer = f.read(samples * 4)
            # Unpack floats with the appropriate endianness
            fmt = (">" if big_endian else "<") + str(samples) + "f"
            img = unpack(fmt, buffer)
        return img, height, width


    # gt_img (path to the .pfm file), baseline, focal_length and doffs are not
    # defined in this snippet; set them first (e.g. from the scene's calibration data).
    depth_img, height, width = read_pfm(gt_img)
    depth_img = np.array(depth_img)
    # Convert from the floating-point disparity value d [pixels] in the .pfm file to depth Z [mm]
    depths = baseline * focal_length / (depth_img + doffs)
    # This reshape assumes a single-channel (Pf) disparity map
    depths = np.reshape(depths, (height, width))
    # PFM stores rows bottom-to-top, so flip vertically
    # (same result as the original np.fliplr([depths])[0])
    depths = np.flipud(depths)
    plt.imshow(depths)
    plt.show()

I know I am very late to this, but the code above works fine for me.
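
For comparison, here is a minimal sketch of the same read done with NumPy alone, closer to the approach in the question. The function name `read_pfm_numpy` is my own and not part of any library. It assumes the file is opened in binary mode (which matters on Windows), that the header is the three text lines described in the PFM spec linked above, and that the rows are stored bottom-to-top, so the array is flipped vertically before it is returned.

```python
import re

import numpy as np


def read_pfm_numpy(path):
    """Read a PFM file; returns (float32 array with the top row first, scale)."""
    with open(path, "rb") as f:  # binary mode, so no newline translation
        # Line 1: "PF" => 3 channels, "Pf" => 1 channel
        header = f.readline().decode("latin-1").strip()
        if header == "PF":
            channels = 3
        elif header == "Pf":
            channels = 1
        else:
            raise ValueError("Not a PFM file.")

        # Line 2: width height
        dims = re.match(r"^(\d+)\s+(\d+)\s*$", f.readline().decode("latin-1"))
        if dims is None:
            raise ValueError("Malformed PFM header.")
        width, height = map(int, dims.groups())

        # Line 3: scale; a negative value means little-endian samples
        scale = float(f.readline().decode("latin-1").strip())
        endian = "<" if scale < 0 else ">"

        # The rest of the file is width * height * channels 4-byte floats
        count = width * height * channels
        data = np.frombuffer(f.read(), dtype=endian + "f4", count=count)

    shape = (height, width, 3) if channels == 3 else (height, width)
    # Rows are stored bottom-to-top; flip so that row 0 is the top of the image.
    # astype() also copies into a writable array in native byte order.
    image = np.flipud(data.reshape(shape)).astype(np.float32)
    return image, abs(scale)
```

Usage would look like `disparity, scale = read_pfm_numpy("disp0.pfm")`, where the file name is only an example. Unlike the code in the question, the array is reshaped to the full `(height, width)` taken from the header rather than `(height - 1, width)`.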