我正在进行一项任务,我必须采用一个包含数据的大型矩阵,并以某种方式压缩数据,使其处于更易管理的形式。但是,需要将数据重新用作其他内容的输入。 (例如,工具箱)。这是我到目前为止所做的。对于这个示例矩阵,我使用find函数给出一个矩阵,其中包含值为非零的所有索引。但我不知道如何使用它作为输入,以保留原始图形信息。我很好奇其他人是否还有其他更好(简单)的解决方案。
number_1 = [0 0 0 0 0 0 0 0 0 0 ...
0 0 1 1 1 1 0 0 0 0 ...
0 1 1 0 1 1 0 0 0 0 ...
0 1 1 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 1 1 1 1 1 1 1 1 0 ...
0 0 0 0 0 0 0 0 0 0];
number = number_1;
compressed_number = find(number);
compressed_number = compressed_number';
disp(compressed_number)
答案 0 :(得分:1)
当你只有1和0,并且填充因子不是非常小时,最好的办法是将数字存储为二进制数;如果您需要原始尺寸,请单独保存。我已经扩展了代码,更清楚地显示了中间步骤,还显示了不同阵列所需的存储量。注意 - 我将您的数据重新整形为13x10阵列,因为它显示得更好。
number_1 = [0 0 0 0 0 0 0 0 0 0 ...
0 0 1 1 1 1 0 0 0 0 ...
0 1 1 0 1 1 0 0 0 0 ...
0 1 1 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 0 0 0 1 1 0 0 0 0 ...
0 1 1 1 1 1 1 1 1 0 ...
0 0 0 0 0 0 0 0 0 0];
n1matrix = reshape(number_1, 10, [])'; % make it nicer to display;
% transpose because data is stored column-major (row index changes fastest).
disp('the original data in 13 rows of 10:');
disp(n1matrix);
% create a matrix with 8 rows and enough columns
n1 = numel(number_1);
nc = ceil(n1/8); % "enough columns"
npad = zeros(8, nc);
npad(1:n1) = number_1; % fill the first n1 elements: the rest is zero
binVec = 2.^(7-(0:7)); % 128, 64, 32, 16, 8, 4, 2, 1 ... powers of two
compressed1 = uint8(binVec * npad); % 128 * bit 1 + 64 * bit 2 + 32 * bit 3...
% showing what we did...
disp('Organizing into groups of 8, and calculated their decimal representation:')
for ii = 1:nc
fprintf(1,'%d ', npad(:, ii));
fprintf(1, '= %d\n', compressed1(ii));
end
% now the inverse operation: using dec2bin to turn decimals into binary
% this function returns strings, so some further processing is needed
% original code used de2bi (no typo) but that requires a communications toolbox
% like this the code is more portable
decompressed = dec2bin(compressed1);
disp('the string representation of the numbers recovered:');
disp(decompressed); % this looks a lot like the data in groups of 8, but it's a string
% now we turn them back into the original array
% remember it is a string right now, and the values are stored
% in column-major order so we need to transpose
recovered = ('1'==decompressed'); % all '1' characters become logical 1
display(recovered);
% alternative solution #1: use logical array
compressed2 = (n1matrix==1);
display(compressed2);
recovered = double(compressed2); % looks just the same...
% other suggestions 1: use find
compressed3 = find(n1matrix); % fewer elements, but each element is 8 bytes
compressed3b = uint8(compressed); % if you know you have fewer than 256 elements
% or use `sparse`
compressed4 = sparse(n1matrix);
% or use logical sparse:
compressed5 = sparse((n1matrix==1));
whos number_1 comp*
the original data in 13 rows of 10:
0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
Organizing into groups of 8, and their decimal representation:
0 0 0 0 0 0 0 0 = 0
0 0 0 0 1 1 1 1 = 15
0 0 0 0 0 1 1 0 = 6
1 1 0 0 0 0 0 1 = 193
1 0 1 1 0 0 0 0 = 176
0 0 0 0 1 1 0 0 = 12
0 0 0 0 0 0 1 1 = 3
0 0 0 0 0 0 0 0 = 0
1 1 0 0 0 0 0 0 = 192
0 0 1 1 0 0 0 0 = 48
0 0 0 0 1 1 0 0 = 12
0 0 0 0 0 0 1 1 = 3
0 0 0 0 0 0 0 0 = 0
1 1 0 0 0 0 0 1 = 193
1 1 1 1 1 1 1 0 = 254
0 0 0 0 0 0 0 0 = 0
0 0 0 0 0 0 0 0 = 0
the string representation of the numbers recovered:
00000000
00001111
00000110
11000001
10110000
00001100
00000011
00000000
11000000
00110000
00001100
00000011
00000000
11000001
11111110
00000000
00000000
compressed2 =
0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
recovered =
0 0 0 0 0 0 0 0 0 0
0 0 1 1 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 1 1 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
Name Size Bytes Class Attributes
compressed1 1x17 17 uint8
compressed2 13x10 130 logical
compressed3 34x1 272 double
compressed3b 34x1 34 uint8
compressed4 13x10 632 double sparse
compressed5 13x10 394 logical sparse
number_1 1x130 1040 double
如您所见,原始数组占用1040个字节;压缩数组需要17个。你得到几乎64倍的压缩(不完全是因为132不是8的倍数);只有非常稀疏的数据集才能通过其他方式更好地压缩。唯一能够接近(而且速度超快)的是
compressed3b = uint8(find(number_1));
34个字节,它绝对是小型阵列(< 256个元素)的竞争者。
注意 - 当您在Matlab中保存数据时(使用save(fileName, 'variableName')
),会自动进行一些压缩。这导致了一个有趣且令人惊讶的结果。当您使用上述每个变量并使用Matlab的save
将它们保存到文件时,文件大小以字节为单位变为:
number_1 195
compressed1 202
compressed2 213
compressed3 219
compressed3b 222
compressed4 256
compressed5 252
另一方面,如果您使用
自己创建二进制文件fid = fopen('myFile.bin', 'wb');
fwrite(fid, compressed1)
fclose(fid)
默认情况下会写uint8
,因此文件大小为130,17,130,34,34 - 稀疏数组不能以这种方式写入。它仍然显示具有最佳压缩的“复杂”压缩。
答案 1 :(得分:0)
首先,您可以使用find
函数获取数组的所有非零索引,而不是手动执行。更多信息:http://www.mathworks.com/help/matlab/ref/find.html
无论如何,您不仅需要matrix
,还需要原始大小。因此,当您将matrix
传递给任何内容时,您还必须传入length(number_1)
。这是因为matrix
不会告诉你在最后一次之后有多少0。你可以通过从原始长度中减去矩阵的最后一个值来计算出来(那里可能有一个错误的错误) )。