Question

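I have found that using MATLAB's fread with the 'int24' option to read data packed as 24-bit integers takes a long time. If I read the data as 'int32', 'int16', or 'int8' instead, the read is much faster than with 'int24'. Is there a better way to reduce the time it takes to read 24-bit integer data?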

To illustrate the problem, sample code is given below.

clear all; close all; clc;

% generate some data and write it as a binary file
n=10000000;
x=randn(n,1);
fp=fopen('file1.bin', 'w');
fwrite(fp, x);
fclose(fp);

% read data in 24-bit format and measure the time
% please note that the data we get here will be different from 'x'.
% The sole purpose of the code is to demonstrate the delay in reading
% 'int24'

tic;
fp=fopen('file1.bin', 'r');
y1=fread(fp, n, 'int24');
fclose(fp);
toc;


% read data in 32-bit format and measure the time
% please note that the data we get here will be different from 'x'.
% This read is only the fast baseline for comparison with the 'int24'
% read above.
tic;
fp=fopen('file1.bin', 'r');
y2=fread(fp, n, 'int32');
fclose(fp);
toc;

The output shows:

Elapsed time is 1.066489 seconds.
Elapsed time is 0.047944 seconds.

Although the 'int32' version reads more data (32 * n bits), it is roughly 22 times faster than the 'int24' read.

Answer 1

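I was able to get about a 4x speedup by reading the data as unsigned 8-bit integers and combining each group of three bytes into the equivalent 24-bit number. Note that this assumes unsigned little-endian values, so you would have to modify it to account for signed or big-endian data: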

>> tic;
>> fp = fopen('file1.bin', 'r');
>> y1 = fread(fp, n, 'bit24');
>> fclose(fp);
>> toc;
Elapsed time is 0.593552 seconds.

>> tic;
>> fp = fopen('file1.bin', 'r');
>> y2 = double(fread(fp, n, '*uint8'));  % This is fastest, for some reason
>> % pad to a multiple of 3 bytes; mod(-N, 3) adds no zeros when N is already a multiple of 3
>> y2 = [1 256 65536]*reshape([y2; zeros(mod(-numel(y2), 3), 1)], 3, []);
>> fclose(fp);
>> toc;
Elapsed time is 0.143388 seconds.

>> isequal(y1,y2.')  % Test for equality of the values

ans =

     1

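In the code above, I just pad y2 with zeros to match the size of y1. The vector y2 also ends up as a row vector instead of a column vector, which can be changed with a simple transpose if needed. For some reason, having fread first output the values as uint8 and then converting them to double is faster than any other option (i.e. outputting directly to double by making the last argument 'uint8' or 'uint8=>double').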
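For signed or big-endian data, the same byte-combining idea can be adapted roughly as follows. This is only a sketch and not part of the timings above: the file name int24_demo.bin, the isBigEndian flag, and the three test samples are made up for illustration, and it assumes two's-complement samples and simply drops any trailing partial sample.

```matlab
% Write three known 24-bit samples (little-endian byte order) to a demo file.
% The bytes encode 1, -1 and -8388608 when read as signed two's complement.
bytes = uint8([1 0 0, 255 255 255, 0 0 128]);
fid = fopen('int24_demo.bin', 'w');
fwrite(fid, bytes, 'uint8');
fclose(fid);

% Read every byte back as uint8 (returned as a column vector).
fid = fopen('int24_demo.bin', 'r');
raw = fread(fid, Inf, '*uint8');
fclose(fid);

% Keep only whole 3-byte samples and put one sample in each column.
nFull = floor(numel(raw) / 3);
b = double(reshape(raw(1:3*nFull), 3, []));

% Choose the byte weights according to the byte order (assumed flag).
isBigEndian = false;
if isBigEndian
    weights = [65536 256 1];   % most significant byte first
else
    weights = [1 256 65536];   % least significant byte first
end
v = (weights * b).';           % unsigned values in 0 .. 2^24-1, as a column

% Sign extension: values with the top bit set represent negative numbers.
neg = v >= 2^23;
v(neg) = v(neg) - 2^24;

disp(v.');
```

The sign fix is applied after combining the bytes, so the unsigned case from above is recovered by skipping the last three lines before disp.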

Answer 2

First of all, it is better to use bitn instead of int*.

If you change int24 to bit32, the code runs just as slowly. So I think it is not about how many bits you read, but something inherent to using bitn.
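One way to check this on your own machine is to read the same file once with 'int32' and once with 'bit32' and compare the timings. The sketch below is an assumption-laden illustration (the file name bit32_demo.bin and the value range are arbitrary); it prints the two times without presuming what they will be.

```matlab
% Write n signed 32-bit integers so both reads consume exactly the same bytes.
n = 1e6;
x = int32(randi([-1000 1000], n, 1));
fid = fopen('bit32_demo.bin', 'w');
fwrite(fid, x, 'int32');
fclose(fid);

% Read with the fixed-width 'int32' precision.
fid = fopen('bit32_demo.bin', 'r');
tic; a = fread(fid, n, 'int32'); tInt = toc;
fclose(fid);

% Read the same data with the bit-field precision 'bit32'.
fid = fopen('bit32_demo.bin', 'r');
tic; b = fread(fid, n, 'bit32'); tBit = toc;
fclose(fid);

fprintf('int32: %.4f s, bit32: %.4f s\n', tInt, tBit);
```

Both reads return the same values as doubles, so any difference in time comes from how fread handles the precision string rather than from the amount of data.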

MATLAB is slow when reading 24-bit integers
