OpenMP FFT乘法加速

时间:2013-12-15 20:14:22

标签: recursion fft openmp

我有递归FFT算法用于乘法运算,我需要用openmp对其进行并行化。经过一些研究和尝试我得到了这个

Complex * multiply(Complex *p1, Complex *p2)
{
#pragma omp parallel
{
//evaluate p1
#pragma omp single nowait
pFFT(n,p1,1);

#pragma omp single nowait
pFFT(n,p2,1);
}

//...multiply part etc

}

void pFFT(int deg, Complex *pol,int sign)
{
if( deg == 1)
    return;

//divide polynom into two parts with even and odd coeficients
Complex *even = new Complex [deg/2];
Complex *odd = new Complex [deg/2];


for(int i = 0;i<deg/2;i++)
{
    even[i] = pol[2*i];
    odd[i]  = pol[2*i+1];
}


#pragma omp task
pFFT(deg/2,even,sign);
#pragma omp task
pFFT(deg/2,odd,sign);
#pragma omp taskwait
//wn = n-th root of unity
int x = lg2(deg);
Complex wn;
wn.re = pcos[x];
wn.im = sign*psin[x];
Complex w;
w.re = 1;
w.im = 0;
Complex *ret = pol;

Complex product;
if(deg==2)
{
        product = mul(odd,&w);
        ret[0].re = even[0].re+product.re;
        ret[0].im = even[0].im+product.im;
        ret[1].re = even[0].re-product.re;
        ret[1].im = even[0].im-product.im;
}
else
    for(int i = 0;i<deg/2-1;i+=2)
    {
        product = mul(odd+i,&w);
        ret[i].re = even[i].re+product.re;
        ret[i].im = even[i].im+product.im;
        ret[i+deg/2].re = even[i].re-product.re;
        ret[i+deg/2].im = even[i].im-product.im;
        w = mul(&w,&wn);
        product = mul(odd+i+1,&w);
        ret[i+1].re = even[i+1].re+product.re;
        ret[i+1].im = even[i+1].im+product.im;
        ret[i+1+deg/2].re = even[i+1].re-product.re;
        ret[i+1+deg/2].im = even[i+1].im-product.im;
        w = mul(&w,&wn);
    }
delete[] even;
delete[] odd;
}

但是代码甚至比顺序版本更慢,只有加速我能做的是,当我删除任务时,让2个线程同时计算每个多项式。我明白,有很多内存操作,但仍然有一些我可以/应该做的事情。比你。

1 个答案:

答案 0 :(得分:0)

您应该先停止并行化。尝试使用16,64,256的大小。在大小2处停止会产生太多的小任务,并且开销相对巨大。