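I'm having some trouble making a simple low-pass filter with a DFT. Eventually I'd like to be able to shift audio in real time, but for now I can't even get this far. I have no training in this area; I only know that the FFT turns a waveform into frequencies and the iFFT turns the frequencies back into a waveform, plus a few other things I've read. To be honest, I'm surprised it works as well as it does so far. Anyway, here is the code: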
byte[] samples = new byte[20000000];
int spos = 0;
samples is filled with 8-bit unsigned PCM here, and spos holds the number of samples.
int samplesize = 128;
int sampleCount = spos / samplesize;
frequencies = new System.Numerics.Complex[sampleCount][];
for (int i = 0; i < sampleCount; i++)
{
Console.WriteLine("Sample " + i + " / " + sampleCount);
frequencies[i] = new System.Numerics.Complex[samplesize];
for (int j = 0; j < samplesize; j++)
{
frequencies[i][j] = (float)(samples[i * samplesize + j] - 128) / 128.0f;
}
dft.Radix2Forward(frequencies[i], MathNet.Numerics.IntegralTransforms.FourierOptions.Default);
}
int shiftUp = 1000; //1khz
int fade = 2; //8 sample fade.
int kick = frequencies[0].Length * shiftUp / rate;
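So now I have computed a bunch of DFTs, one for each 128-sample section of the input. kick is (I hope) the number of DFT bins that span 1000 Hz. That is, since the first frequencies[0].Length / 2 bins hold the magnitude data for frequencies up to rate / 2 Hz, frequencies[0].Length / 2 * shiftUp / (rate / 2) = frequencies[0].Length * shiftUp / rate should give me the right value.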
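Written out, the bin arithmetic above looks like this, with N the block length (frequencies[0].Length), f_s the sample rate (rate) and f_c the cutoff (shiftUp). The 44100 Hz rate in the worked number is only an assumed example, since the post does not state the actual rate.

```latex
\Delta f = \frac{f_s}{N}, \qquad
k = \left\lfloor \frac{f_c}{\Delta f} \right\rfloor
  = \left\lfloor \frac{N \, f_c}{f_s} \right\rfloor
% Assumed example: N = 128, f_c = 1000 Hz, f_s = 44100 Hz
% k = \lfloor 128000 / 44100 \rfloor = \lfloor 2.90 \rfloor = 2
```

Under that assumption each bin is about 345 Hz wide and bins 0 through 2 only reach roughly 689 Hz, so with 128-sample blocks the cutoff can only be placed coarsely.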
for (int i = 0; i < sampleCount; i++)
{
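This is the part I'm having trouble with. Without it, the output sounds great! It skips index 0 and index 64. Both of those have an imaginary component of 0, and I remember reading somewhere that the value at index 0 is important...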
for (int j = 0; j < frequencies[i].Length; j++)
{
if (j == 0 || j == 64)
continue;
if (j < 64)
{
if (!(j < kick + 1))
{
frequencies[i][j] = 0;
}
}
else
{
if (!(j - 64 > 63 - kick))
{
frequencies[i][j] = 0;
}
}
}
Finally it undoes the transform
dft.Radix2Inverse(frequencies[i], MathNet.Numerics.IntegralTransforms.FourierOptions.Default);
...throws it back into the samples array
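for (int j = 0; j < samplesize; j++)
//Clamp to the byte range so filter overshoot cannot wrap around when cast.
samples[i * samplesize + j] = (byte)Math.Max(0.0, Math.Min(255.0, frequencies[i][j].Real * 128.0 + 128.0));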
}
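On the two skipped bins in the loop above: for real-valued input of even length N, bin 0 (the DC term) and bin N/2 (the Nyquist term) are purely real, and every other bin k has a mirror partner at N - k holding its complex conjugate. That is why the mask has to treat j and N - j as a pair. The sketch below checks this numerically with a naive DFT written only for illustration; RealDftSymmetry, NaiveDft and MaxSymmetryError are hypothetical names, and this is not the Math.NET transform used in the post.

```csharp
using System;
using System.Numerics;

public static class RealDftSymmetry
{
    // Naive O(N^2) forward DFT (illustrative helper, no external dependencies).
    public static Complex[] NaiveDft(double[] x)
    {
        int n = x.Length;
        var result = new Complex[n];
        for (int k = 0; k < n; k++)
        {
            Complex sum = Complex.Zero;
            for (int t = 0; t < n; t++)
            {
                double angle = -2.0 * Math.PI * k * t / n;
                sum += x[t] * new Complex(Math.Cos(angle), Math.Sin(angle));
            }
            result[k] = sum;
        }
        return result;
    }

    // Largest deviation from conjugate symmetry X[n - k] == conj(X[k]).
    public static double MaxSymmetryError(Complex[] spectrum)
    {
        int n = spectrum.Length;
        double worst = 0.0;
        for (int k = 1; k < n; k++)
        {
            Complex diff = spectrum[n - k] - Complex.Conjugate(spectrum[k]);
            worst = Math.Max(worst, diff.Magnitude);
        }
        return worst;
    }
}
```

Calling NaiveDft on any real block and inspecting the imaginary parts of bins 0 and N/2 should give values that are zero up to rounding error, and MaxSymmetryError should be similarly tiny. Zeroing one bin of a mirror pair without the other would break that symmetry and leave a non-zero imaginary part after the inverse transform.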
...puts it into a file
BinaryWriter bw = new BinaryWriter(File.OpenWrite("sound"));
for (int i = 0; i < spos; i++)
{
bw.Write(samples[i]);
}
bw.Close();
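...and then I import it into Audacity and murder my ears with artifacts.
The spectrogram shows that the code works, to some extent.
However, there is an annoying high-pitched crackle throughout the whole song. I've heard about the Gibbs phenomenon and window functions, but I really don't know how to apply them here. The fade variable is my best attempt at a window function: everything past the 1000 Hz mark fades to 0 over 2 samples.
Any ideas?
Thanks!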
Answer 0 (score: 0)
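So it turns out I was right (yay): I was getting a click every 1024 samples, which made the audio sound awful. To fix it, I cross-fade between many short, overlapping chunks of filtered audio. It isn't fast, but it works, and I'm fairly sure this is what "windowing" means.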
public class OggDFT
{
int sample_length;
byte[] samples;
DragonOgg.MediaPlayer.OggFile f;
int rate = 0;
System.Numerics.Complex[][] frequencies;
DiscreteFourierTransform dft = new DiscreteFourierTransform();
int samplespacing = 128;
int samplesize = 1024;
int sampleCount;
public void ExampleLowpass()
{
int shiftUp = 1000; //1khz
int fade = 2; //8 sample fade.
int halfsize = samplesize / 2;
int kick = frequencies[0].Length * shiftUp / rate;
for (int i = 0; i < sampleCount; i++)
{
for (int j = 0; j < frequencies[i].Length; j++)
{
if (j == 0 || j == halfsize)
continue;
if (j < halfsize)
{
if (!(j < kick + 1))
{
frequencies[i][j] = 0;
}
}
else
{
if (!(j - halfsize > halfsize - 1 - kick))
{
frequencies[i][j] = 0;
}
}
}
dft.BluesteinInverse(frequencies[i], MathNet.Numerics.IntegralTransforms.FourierOptions.Default);
}
}
public OggDFT(DragonOgg.MediaPlayer.OggFile f)
{
this.f = f;
//Make a 20MB buffer.
samples = new byte[20000000];
int sample_length = 0;
//This block simply loads the uncompressed data from the ogg file into a nice n' large 20MB buffer. If you want to use the same library as I've used, it's called DragonOgg (if you can't tell by the namespace).
while (sample_length < samples.Length)
{
var bs = f.GetBufferSegment(4096); //Get ~4096 bytes (not guaranteed to return exactly 4096 bytes).
if (bs.ReturnValue == 0)
break; //End of file
//Set the rate
rate = bs.RateHz;
//Display some loading info:
Console.WriteLine("seconds: " + sample_length / rate);
//It's stereo so we want half the data.
int max = bs.ReturnValue / 2;
//Buffer overflow care.
if (samples.Length - sample_length < max)
max = samples.Length - sample_length;
//The copier.
for (int j = 0; j < max; j++)
{
//I'm using j * 2 here because I know that the input audio is 8Bit Stereo, and we want just one mono channel. So we skip every second one.
samples[sample_length + j] = bs.Buffer[j * 2];
}
sample_length += max;
if (max == 0)
break;
}
sampleCount = (sample_length - 1) / samplespacing + 1;
frequencies = new System.Numerics.Complex[sampleCount][];
for (int i = 0; i < sample_length; i += samplespacing)
{
Console.WriteLine("Sample---" + i + " / " + sample_length);
System.Numerics.Complex[] sample;
if (i + samplesize > sample_length)
sample = new System.Numerics.Complex[sample_length - i];
else
sample = new System.Numerics.Complex[samplesize];
for (int j = 0; j < sample.Length; j++)
{
sample[j] = (float)(samples[i + j] - 128) / 128.0f;
}
dft.BluesteinForward(sample, MathNet.Numerics.IntegralTransforms.FourierOptions.Default);
frequencies[i / samplespacing] = sample;
}
//Perform the filters to the frequencies
ExampleLowpass();
//Make window kernel thingy
float[] kernel = new float[samplesize / samplespacing * 2];
for (int i=0; i<kernel.Length; i++)
{
kernel[i] = (float)((1-Math.Cos(2*Math.PI*i/(kernel.Length - 1)))/2);
}
//Apply window kernel thingy
for (int i = 0; i < sample_length; i++)
{
int jstart = i / samplespacing - samplesize / samplespacing + 1;
int jend = i / samplespacing;
if (jstart < 0) jstart = 0;
float ktotal = 0;
float stotal = 0;
for (int j = jstart; j <= jend; j++)
{
float kernelHere = 1.0f;
if (jstart != jend)
kernelHere = kernel[(j - jstart) * kernel.Length / (jend + 1 - jstart)];
int index = i - j * samplespacing;
stotal += (float)frequencies[j][index].Real * kernelHere;
ktotal += kernelHere;
}
if (ktotal != 0)
{
stotal /= ktotal;
samples[i] = (byte)(stotal * 128 * 0.9f + 128);
}
else
{
Console.WriteLine("BAD " + jstart + " " + jend + " sec: " + ((float)i / rate));
samples[i] = (byte)(stotal * 128 * 0.9f + 128);
}
}
BinaryWriter bw = new BinaryWriter(File.OpenWrite("sound"));
for (int i = 0; i < sample_length; i++)
{
bw.Write(samples[i]);
}
bw.Close();
}
}
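If you want to compile it, you will need DragonOgg (http://sourceforge.net/projects/dragonogg/) and MathNet.Numerics (http://mathnetnumerics.codeplex.com/).
I hope it helps someone. I don't know how Stack Overflow content is licensed by default, but this code is public domain.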
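The cross-fading idea can also be sketched on its own, separate from the Ogg loading. The version below is a simplified variant and not the code above: it weights each sample by a Hann window over its position inside the block, then divides by the accumulated weight. OverlapAddSketch, HannWindow and CrossFade are hypothetical names, and the blocks are assumed to have already been filtered and transformed back to the time domain.

```csharp
using System;

public static class OverlapAddSketch
{
    // Periodic Hann window: w[n] = 0.5 * (1 - cos(2*pi*n / length)).
    public static double[] HannWindow(int length)
    {
        var w = new double[length];
        for (int n = 0; n < length; n++)
            w[n] = 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * n / length));
        return w;
    }

    // Cross-fades overlapping blocks, where block b starts at sample b * hop.
    // Each output sample is the window-weighted average of all blocks covering it.
    public static double[] CrossFade(double[][] blocks, int hop, int totalLength)
    {
        var sum = new double[totalLength];
        var weight = new double[totalLength];
        for (int b = 0; b < blocks.Length; b++)
        {
            double[] w = HannWindow(blocks[b].Length);
            for (int n = 0; n < blocks[b].Length; n++)
            {
                int pos = b * hop + n;
                if (pos >= totalLength) break;
                sum[pos] += blocks[b][n] * w[n];
                weight[pos] += w[n];
            }
        }

        var output = new double[totalLength];
        for (int i = 0; i < totalLength; i++)
            output[i] = weight[i] > 1e-12 ? sum[i] / weight[i] : 0.0; // no coverage: silence
        return output;
    }
}
```

Because the window falls to zero at the block edges, each block contributes almost nothing where its filtered output is least trustworthy, which is what removes the click at block boundaries. With unfiltered blocks cut straight from a signal, CrossFade reproduces that signal everywhere the accumulated weight is non-zero.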
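Thinking about it further, I decided I could get an approximate effect much more easily by simply "blurring" the samples to make a basic low-pass filter. A high-pass filter can then be made by subtracting the low-pass result from the original signal.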
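A minimal sketch of that "blur" idea, assuming the blur is a centered moving average (the post does not say which kind of blur was meant). BlurFilterSketch, LowPass, HighPass and the radius parameter are hypothetical names chosen for illustration.

```csharp
using System;

public static class BlurFilterSketch
{
    // Centered moving average over up to (2 * radius + 1) samples.
    // Near the edges only the samples that exist are averaged.
    public static double[] LowPass(double[] input, int radius)
    {
        var output = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            int start = Math.Max(0, i - radius);
            int end = Math.Min(input.Length - 1, i + radius);
            double sum = 0.0;
            for (int j = start; j <= end; j++)
                sum += input[j];
            output[i] = sum / (end - start + 1);
        }
        return output;
    }

    // High-pass as "original minus low-pass".
    public static double[] HighPass(double[] input, int radius)
    {
        double[] low = LowPass(input, radius);
        var output = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
            output[i] = input[i] - low[i];
        return output;
    }
}
```

A larger radius lowers the cutoff. A moving average is a crude filter with a gentle roll-off, so it approximates the DFT-based version rather than matching it, but it works sample by sample and needs no block handling at all.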