Question

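I'm a new MATLAB user with very little programming experience (my background is in mechanical engineering), so I apologise in advance if this is a simple question!

I'm trying to import a large point cloud file (.pts file extension) into MATLAB for processing. I believe the file contains a text header followed by 3 columns of integer data (x, y and z coordinates). I managed to open the first part of the file as a text file, and that appears to be the case.

I can't import the file into MATLAB directly because it is too large (875 million points) and only 9,000,000 rows can be imported at a time, so I wrote the script below to import the file in 9000000x3 blocks, saving each block as a MATLAB file (or another appropriate format).

Script: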

filename='pointcloud.pts';
fid = fopen(filename,'r');
frewind(fid);
header=fread(fid,8,'*char');
points=fread(fid,1,'*int32');
pointsinpass=9000000;
numofpasses=(points/pointsinpass)
counter = 1;

while counter <= numofpasses;

   clear block;

   block=zeros(pointsinpass,3);


    for p=1:pointsinpass;
      block(p,[1:3])=fread(fid, 1,'float');
    end;

    indx=counter;
    filename=sprintf('block%d',indx);
    save (filename), block;


    disp('Iteration')
    disp(counter)
    disp('complete')
    counter=counter+1;


end;
fclose(fid);

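The script runs fine and loops through 5 iterations, importing 5 blocks of data. Then, when it tries to import the 6th block, I get the following error: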

Subscripted assignment dimension mismatch.

Error in LiDARread_attempt5 (line 22)
          block(p,[1:3])=fread(fid, 1,'float');

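I'm not sure what is causing the error. I believe it is related to the size argument of the fread command, as I have tried various values, such as 3, which allowed only one block to be imported before the dimension mismatch error occurred.

Again, I apologise if I'm missing something very basic. My understanding of programming techniques is very limited, as I was only introduced to them a few months ago.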

Answer 1

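At some point fread() returns an empty array, [] (most likely once it reaches the end of the file), and an empty array cannot be assigned to a 1x3 slice of block.

I can show how to reproduce the error: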

a = zeros(2,2)
a =
     0     0
     0     0
a(2,1:2) = []

Subscripted assignment dimension mismatch.

I suggest using textscan() instead of fread().

Answer 2

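MATLAB is a great tool, but I have found it difficult to use for big data problems. Although it means a learning curve, may I suggest you take a look at Python? I switched from MATLAB to Python many years ago and haven't looked back much since.

Spyder is a powerful IDE (http://code.google.com/p/spyderlib/) that should make a good bridge for MATLAB users. Pythonxy for Windows (http://code.google.com/p/pythonxy/) will give you all the tools you need to work productively on that platform, but last time I checked it only supported a 32-bit address space. If you need 64-bit support on Windows, cgohlke (https://stackoverflow.com/users/453463/cgohlke) provides great packages at http://www.lfd.uci.edu/~gohlke/pythonlibs/. On Linux, of course, all the required packages can be installed very easily. In all cases you will need to use Python 2.7 for full compatibility with the required packages.

I don't know all the details of your problem, but using the numpy memmap data structure might help. It lets you work with large arrays on disk without loading the entire array into main memory. It takes care of the internals for you.

Basically, all you do is: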

## memmap example
import numpy as np

# Notice we first use mode 'w+' to create the file. Subsequent reads
# (and modifications) can use 'r+'.
fpr = np.memmap('MemmapOutput', dtype='float32', mode='w+', shape=(3000000, 4))
fpr[:] = np.random.rand(3000000, 4)  # write into the mapped file; "fpr = ..." would only rebind the name
del fpr  # this frees the array and flushes to disk
fpr = np.memmap('MemmapOutput', dtype='float32', mode='r+', shape=(3000000, 4))
fpr[:] = np.random.rand(3000000, 4)  # reassign the values; in general you might not need to modify the array, but it can be done
columnSums = fpr.sum(axis=0)  # axis=0 sums down each column; notice you can use all the numpy functions seamlessly
del fpr  # best to close the array again when done processing
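
As a rough sketch of how this could apply to the point cloud in the question, the function below copies a text point file into a float32 memmap a chunk of lines at a time, so the whole file never has to sit in memory. It rests on assumptions I can't verify from the question: that the first line of the .pts file holds the number of points and that every following line holds whitespace-separated x, y and z values. The name pts_to_memmap, the file names and the chunk size are made up for illustration.

```python
import itertools

import numpy as np


def pts_to_memmap(pts_path, out_path, chunk_rows=1_000_000):
    """Copy a text point file into a float32 memmap, chunk_rows lines at a time.

    Returns the number of rows actually written.
    """
    with open(pts_path, "r") as f:
        # Assumption: the first line of the file holds the number of points.
        n_points = int(f.readline().split()[0])
        out = np.memmap(out_path, dtype="float32", mode="w+", shape=(n_points, 3))
        row = 0
        while row < n_points:
            # Take at most chunk_rows lines; the last chunk may be shorter.
            lines = list(itertools.islice(f, min(chunk_rows, n_points - row)))
            if not lines:
                break  # file ended early; stop rather than assign an empty block
            # Assumption: columns 0..2 of every line are x, y and z.
            block = np.loadtxt(lines, dtype="float32", usecols=(0, 1, 2), ndmin=2)
            out[row:row + len(block)] = block
            row += len(block)
        out.flush()  # make sure everything is written to disk
    return row
```

Calling something like pts_to_memmap('pointcloud.pts', 'pointcloud.f32') once would then let you reopen the result with np.memmap(..., mode='r') and slice it like an ordinary array. The explicit check for an empty chunk plays the same role as checking for an empty fread() result in MATLAB: the loop stops cleanly at the end of the file instead of failing on an assignment.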

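Please don't take this the wrong way. I'm not trying to convince you to stop using MATLAB, but rather to consider adding another tool to your toolset.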

Importing a large point cloud data file into MATLAB
