Question

While learning Rayon, I wanted to compare the performance of parallel and serial computation of the Fibonacci sequence. Here is my code:

use rayon;
use std::time::Instant;

fn main() {
    let nth = 30;
    let now = Instant::now();
    let fib = fibonacci_serial(nth);
    println!(
        "[s] The {}th number in the fibonacci sequence is {}, elapsed: {}",
        nth,
        fib,
        now.elapsed().as_micros()
    );

    let now = Instant::now();
    let fib = fibonacci_parallel(nth);
    println!(
        "[p] The {}th number in the fibonacci sequence is {}, elapsed: {}",
        nth,
        fib,
        now.elapsed().as_micros()
    );
}

fn fibonacci_parallel(n: u64) -> u64 {
    if n <= 1 {
        return n;
    }

    let (a, b) = rayon::join(|| fibonacci_parallel(n - 2), || fibonacci_parallel(n - 1));
    a + b
}

fn fibonacci_serial(n: u64) -> u64 {
    if n <= 1 {
        return n;
    }

    fibonacci_serial(n - 2) + fibonacci_serial(n - 1)
}


I expected the parallel computation to take less time than the serial one, but the result was the opposite:

# `s` stands for serial calculation and `p` for parallel
[s] The 30th number in the fibonacci sequence is 832040, elapsed: 12127
[p] The 30th number in the fibonacci sequence is 832040, elapsed: 990379

My serial/parallel implementations may well be flawed. But if they are not, why am I seeing these results?

Answer 1

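I think the real reason is that you create far too many parallel tasks, which is not good. Every call to fibonacci_parallel hands another pair of tasks to rayon, and because you call fibonacci_parallel again inside the closures, each of those creates yet another pair. The number of rayon::join calls therefore grows exponentially with n, while each call guards only a comparison and an addition, so the scheduling overhead dominates the actual work.
This is utterly terrible for rayon.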
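To get a feel for the scale, the recursion can be instrumented without rayon at all. The sketch below counts how often the original fibonacci_parallel would reach its rayon::join line; count_joins is a hypothetical helper written for illustration and is not part of rayon or of the question's code.

```rust
/// Counts how many times the original `fibonacci_parallel(n)` would call
/// `rayon::join`, i.e. how often the recursion takes the `n > 1` branch.
/// `count_joins` is a hypothetical helper for illustration only.
fn count_joins(n: u64) -> u64 {
    if n <= 1 {
        return 0; // base case: returns immediately, no join
    }
    // One join at this level, plus the joins made inside both closures.
    1 + count_joins(n - 2) + count_joins(n - 1)
}

fn main() {
    for n in [10, 20, 30] {
        println!("n = {:2}: {} join calls", n, count_joins(n));
    }
}
```

For n = 30 this reports 1346268 join calls, each wrapping a single addition.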

One way to tackle this could be:

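fn fibonacci_parallel(n: u64) -> u64 {
    // Plain serial recursion; no rayon calls in here.
    fn inner(n: u64) -> u64 {
        if n <= 1 {
            return n;
        }

        inner(n - 2) + inner(n - 1)
    }

    if n <= 1 {
        return n;
    }

    // Split only once, at the top level; each half then runs serially.
    let (a, b) = rayon::join(|| inner(n - 2), || inner(n - 1));
    a + b
}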

You hand rayon just two tasks, each of which runs the serial inner function. With this change, I get:

op@VBOX /t/t/foo> cargo run --release 40
    Finished release [optimized] target(s) in 0.03s
     Running `target/release/foo 40`
[s] The 40th number in the fibonacci sequence is 102334155, elapsed: 1373741
[p] The 40th number in the fibonacci sequence is 102334155, elapsed: 847343

But for small inputs, parallel execution is not worth it:

op@VBOX /t/t/foo> cargo run --release 20
    Finished release [optimized] target(s) in 0.02s
     Running `target/release/foo 20`
[s] The 20th number in the fibonacci sequence is 6765, elapsed: 82
[p] The 20th number in the fibonacci sequence is 6765, elapsed: 241

Why does Rayon-based parallel processing take more time than serial processing?
