我使用以下代码进行就地前向FIR滤波:
lena = len(a)
lenb = len(b)
convol = zeros(a.shape)
code = """
// Forward convolution
int pad = (lenb-1)/2;
int i, j;
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for the next pass
j = lena-1-i+pad;
convol(j) = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
convol(j) += a(k)*b(i - k);
}
}
// Backward convolution (the signal in convol has been
// reversed using reversed indexes)
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for reordering the output vector
j = lena-1-i+pad;
a(j) = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
a(j) += convol(k)*b(i - k);
}
}
return_val = 1;
"""
weave.inline(code, [ 'a', 'b', 'lena', 'lenb', 'convol'],
type_converters=converters.blitz, compiler = 'g++')
当然,“convol”变量不需要在C / C ++范围之外看到,我需要空间和处理最优性。因此,通过以下方式替换此代码是有道理的(至少对我而言):
lena = len(a)
lenb = len(b)
code = """
// Forward convolution
int pad = (lenb-1)/2;
int i, j;
float* convol = (float*) calloc(lena, sizeof(float));
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for the next pass
j = lena-1-i+pad;
convol[j] = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
convol[j] += a(k)*b(i - k);
}
}
// Backward convolution (the signal in convol has been
// reversed using reversed indexes)
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for reordering the output vector
j = lena-1-i+pad;
a(j) = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
a(j) += convol[k]*b(i - k);
}
}
free(convol);
return_val = 1;
"""
weave.inline(code, [ 'a', 'b', 'lena', 'lenb'],
type_converters=converters.blitz, compiler = 'g++')
唯一的区别是使用numpy数组而不是直接使用C float数组。问题是第二个代码的处理时间比第一个代码长2倍......为什么会这样?第二个版本有什么问题吗?它应该更快......!