Question

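While trying to write an optimized DSP algorithm, I was wondering about the relative speed of stack allocation versus heap allocation, and about the size limits of stack-allocated arrays. I realize there is a stack frame size limit, but I don't understand why the following runs and produces seemingly realistic benchmark results with cargo bench, yet fails with a stack overflow when run with cargo test --release.

#![feature(test)]
extern crate test;

#[cfg(test)]
mod tests {
    use test::Bencher;

    #[bench]
    fn it_works(b: &mut Bencher) {
        b.iter(|| {
            let stack = [[[0.0; 2]; 512]; 512];
        });
    }
}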

function MinimalExample

MixedShade = [000 000 205; ...  % Medium Blue
              255 064 064; ...  % Brown
              000 233 255; ...  % Aqua
              255 185 015] ...  % Gold
             ./255;

alpha = [0.1:0.1:0.4];        
a = [1*1e0, 1*1e-1, 1*1e-2, 1*1e-3] ;
A = [1*a; 2*a; 3*a; 4*a];
b = [1*1e+1, 1*1e+2, 1*1e+3, 1*1e+4] ;
B = [1*b; 2*b; 3*b; 4*b];

for ii = 1:size(A, 1)   % iterate over rows, since A(ii, :) indexes a row
    semilogy(alpha, A(ii, :), 'color', MixedShade(ii, :), 'LineStyle', '-', ...
             'Marker','o', 'MarkerFaceColor', MixedShade(ii, :));
    hold on
end

for ii = 1:size(B, 1)   % iterate over rows, since B(ii, :) indexes a row
    semilogy(alpha, B(ii, :), 'color', MixedShade(ii, :), 'LineStyle', '-', ...
             'Marker', '>', 'MarkerFaceColor', MixedShade(ii, :));
    hold on
end

xlabel('$\alpha$', 'Interpreter', 'LaTex'); 
ylabel('$Cost$', 'Interpreter', 'LaTex'); 
grid on 
xlim([0.1 0.4]) 
legend('A,Set1', 'A,Set2', 'A,Set3', 'A,Set4', ...
       'B,Set1', 'B,Set2', 'B,Set3', 'B,Set4', ...
       'location','best');
set(get(gcf,'CurrentAxes'), 'FontName', 'Times', 'FontSize', 14);

Answer 1

To put things in perspective, note that the size of your array is 8 × 2 × 512 × 512 bytes = 4 MiB.
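The arithmetic above can be checked directly. This is a minimal sketch, assuming the 0.0 literals in the question are inferred as f64 (the default float type); the Buffer alias and the buffer_bytes helper are names introduced here for illustration only.

```rust
use std::mem::size_of;

// Type of the array from the question, assuming the elements are f64.
type Buffer = [[[f64; 2]; 512]; 512];

/// Returns the size of the buffer type in bytes (no array is allocated).
fn buffer_bytes() -> usize {
    size_of::<Buffer>()
}

fn main() {
    let bytes = buffer_bytes();
    println!("{} bytes = {} MiB", bytes, bytes / (1024 * 1024));
}
```

size_of only inspects the type, so running this does not put the 4 MiB array on the stack.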

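cargo test crashes but cargo bench does not because "test" calls the function it_works() in a new thread, whereas "bench" calls it in the main thread.

The default stack size of the main thread is typically 8 MiB, so the array occupies half of the available stack. That is a lot, but there is still room left, so the benchmark runs normally.

The stack size of a new thread, however, is typically much smaller. On Linux it is 2 MiB, and other platforms could be even smaller. Your 4 MiB array therefore easily overflows the thread's stack and causes a stack overflow / segfault.

You can increase the default stack size of new threads by setting the RUST_MIN_STACK environment variable: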

$ RUST_MIN_STACK=8388608 cargo test
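If changing the environment is not convenient, the stack size can also be requested per thread with std::thread::Builder::stack_size. The following is only a sketch of that alternative, not something the test harness does for you: run_with_stack and sum_on_stack are hypothetical helper names, and the 16 MiB figure is an arbitrary value chosen to leave headroom above the 4 MiB array.

```rust
use std::thread;

/// Runs `f` on a newly spawned thread whose stack is `stack_bytes` large
/// and returns its result. (Hypothetical helper for illustration.)
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

/// Places the 4 MiB array from the question on the current thread's stack,
/// writes one element, and sums all elements.
fn sum_on_stack() -> f64 {
    let mut stack = [[[0.0f64; 2]; 512]; 512];
    stack[511][511][1] = 1.0;
    stack.iter().flatten().flatten().sum()
}

fn main() {
    // 16 MiB is an assumed value, comfortably larger than the array.
    let total = run_with_stack(16 * 1024 * 1024, sum_on_stack);
    println!("sum = {}", total);
}
```

Calling sum_on_stack() directly from a thread with a 2 MiB stack would be expected to overflow, which is the situation described above.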

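cargo test runs tests in parallel threads to improve the total test time, while benchmarks are run sequentially in the same thread to reduce noise.

Because the stack size is limited, allocating this array on the stack is a bad idea. You have to store it either on the heap (box it) or in a global static mut.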
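As a rough sketch of the heap option, assuming f64 elements: the vec! macro builds the outer dimension on the heap, so only one 8 KiB row is ever on the stack. This uses a Vec rather than a literal Box of the whole array, because Box::new(array) may still construct the array on the stack first in unoptimized builds. heap_buffer is a name introduced here for illustration.

```rust
/// Allocates the 512 x 512 x 2 buffer on the heap.
/// Only a single row ([[f64; 2]; 512], 8 KiB) is built on the stack
/// before being cloned into the vector.
fn heap_buffer() -> Vec<[[f64; 2]; 512]> {
    vec![[[0.0; 2]; 512]; 512]
}

fn main() {
    let mut buf = heap_buffer();
    // Indexing works the same way as with the nested array.
    buf[10][20][1] = 0.5;
    println!("rows = {}, sample = {}", buf.len(), buf[10][20][1]);
}
```

This version should work in both cargo test and cargo bench, since it does not depend on the stack size of the calling thread.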

cargo test --release causes a stack overflow. Why doesn't cargo bench?
