Question

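I am trying to build a wavetable synthesizer in Python for the first time (based on an example I found at https://blamsoft.com/tutorials/expanse-creating-wavetables/), but the synthesized sound has no audible pitch at all. My output is just a low hum. I am very new to making wavetables in Python, and I would like to know what I am missing in order to write an A440 sine wavetable to the file "wavetable.wav" and actually use it to produce a pure sine tone. This is what I have so far: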

import wave
import struct
import numpy as np

frame_count = 256
frame_size = 2048
sps = 44100
freq_hz = 440
file = "wavetable.wav" #write waveform to file

wav_file = wave.open(file, 'w')
wav_file.setparams((1, 2, sps, frame_count, 'NONE', 'not compressed'))


values = bytes(0)

for i in range(frame_count):
    for ii in range(frame_size):
       
        sample = np.sin((float(ii)/frame_size) * (i+128)/256 * 2 * np.pi * freq_hz/sps) * 65535 
        
        
        if sample < 0:
            sample = 0
        
        sample -= 32768
        sample = int(sample)

        values += struct.pack('h', sample) 

wav_file.writeframes(values)
wav_file.close()

print("Generated " + file)

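The sine function inside the for loop is probably the part I understand least, because I just followed the example verbatim. I am used to writing sine functions of the form y = A sin(2πfx), but I am not sure what the purpose is of multiplying by (i + 128) / 256 and by 65535 (16-bit amplitude resolution?). I am also not sure why 32768 is subtracted from every sample. Can anyone clarify what I am missing, or point me in the right direction? Am I going about this the wrong way? Any help is appreciated!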

Answer 1

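If you just want to generate the sound data ahead of time and then dump it all to a file, and you are also comfortable with NumPy, I suggest using it together with a library such as SoundFile. That way there is no need to split the data into frames.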

Starting with a naive approach (using numpy.sin, with no attempt at optimization yet), you end up with something like this:

from math import floor, tau
import numpy as np
import soundfile as sf

file_path = 'sine.flac'
sample_rate = 48_000   # hertz
duration = 1.0         # seconds
frequency = 432.0      # hertz
amplitude = 0.8        # (not in decibels!)
start_phase = 0.0      # at what phase to start

sample_count = floor(sample_rate * duration)

# cyclical frequency in sample^-1
omega = frequency * tau / sample_rate

# all phases for which we want to sample our sine
phases = np.linspace(start_phase, start_phase + omega * sample_count,
                     sample_count, endpoint=False)

# our sine wave samples, generated all at once
audio = amplitude * np.sin(phases)

# now write to file
fmt, sub = 'FLAC', 'PCM_24'
assert sf.check_format(fmt, sub) # to make sure we ask the correct thing beforehand
sf.write(file_path, audio, sample_rate, format=fmt, subtype=sub)

This will be mono sound; you can write stereo by using a two-dimensional array (see the NumPy and SoundFile documentation).
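As a rough sketch of the stereo case, the array below has one column per channel, which is the (frames, channels) layout that SoundFile's write function accepts. The names left, right, and write_stereo, and the 648 Hz second tone, are made up for illustration and are not part of the code above.

```python
from math import tau

import numpy as np

sample_rate = 48_000   # hertz
duration = 1.0         # seconds
sample_count = int(sample_rate * duration)

# time of each sample in seconds
t = np.arange(sample_count) / sample_rate

# a different tone in each channel (frequencies chosen arbitrarily)
left = 0.8 * np.sin(tau * 432.0 * t)
right = 0.8 * np.sin(tau * 648.0 * t)

# shape (frames, channels): one column per channel
stereo = np.column_stack((left, right))


def write_stereo(path, data, rate):
    # imported here so the array code above runs without SoundFile installed
    import soundfile as sf
    sf.write(path, data, rate)
```

Calling write_stereo('sine_stereo.flac', stereo, sample_rate) should then produce a two-channel file, assuming SoundFile is installed.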

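Note, however, that to make a wavetable specifically, you need to make sure it contains exactly one period of the waveform (or an integer number of periods), so that the wavetable plays back without clicks and at the correct frequency.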
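A minimal sketch of such a table, assuming the 2048-sample frame size from the question and plain 16-bit mono WAV output through the standard wave module. The function names single_cycle_sine and write_wavetable are hypothetical. Note that the table itself carries no pitch: read sample by sample at 44100 Hz, a 2048-sample cycle repeats only about 21.5 times per second, so the 440 Hz has to come from how fast the synthesizer steps through the table.

```python
import wave
from math import tau

import numpy as np


def single_cycle_sine(frame_size=2048):
    """Return exactly one sine period as floats in [-1, 1]."""
    # endpoint=False: the sample after the last one would equal the first,
    # so the table loops without a click
    phases = np.linspace(0.0, tau, frame_size, endpoint=False)
    return np.sin(phases)


def write_wavetable(path, table, sample_rate=44100):
    """Write a float table as mono signed 16-bit little-endian PCM."""
    pcm = np.round(table * 32767).astype('<i2')   # scale to the int16 range
    with wave.open(path, 'wb') as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)           # 2 bytes = 16 bits
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


table = single_cycle_sine()
```

Calling write_wavetable('wavetable.wav', table) writes the single cycle to disk. Signed 16-bit samples are already centered on zero, so no offset of 32768 is needed here.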

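You can also play sound in chunks in real time from Python, using something like PyAudio. (I have not used it yet, so for a while at least this answer will lack code for that.)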

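Finally, to be frank, none of the above has anything to do with generating sound data from a wavetable: you just pick a wavetable from somewhere, which does not do much for the actual synthesis. Here is a simple starter algorithm. Suppose you want to play a chunk of sample_count samples, and you have a wavetable stored in wavetable that is a single period, loops perfectly, and is normalized. Also suppose your current wave phase is start_phase, the frequency is frequency, the sample rate is sample_rate, and the amplitude is amplitude. Then: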

# indices for the wavetable values; this is just for `np.interp` to work
wavetable_period = float(len(wavetable))
wavetable_indices = np.linspace(0, wavetable_period,
                                len(wavetable), endpoint=False)

# frequency of the wavetable played at native resolution
wavetable_freq = sample_rate / wavetable_period

# start index into the wavetable
start_index = start_phase * wavetable_period / tau

# code above you run just once at initialization of this wavetable ↑
# code below is run for each audio chunk ↓

# samples of wavetable per output sample
shift = frequency / wavetable_freq

# fractional indices into the wavetable
indices = np.linspace(start_index, start_index + shift * sample_count,
                      sample_count, endpoint=False)

# linearly interpolated wavetable sampled at our frequency
audio = np.interp(indices, wavetable_indices, wavetable,
                  period=wavetable_period)
audio *= amplitude

# at last, update `start_index` for the next chunk
start_index += shift * sample_count

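Then you output audio. Although there are better ways to play back a wavetable, linear interpolation is at least a good start. This approach also allows frequency glides: just compute indices in another way, no longer evenly spaced.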
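One possible way to compute such unevenly spaced indices is to accumulate a per-sample step instead of using np.linspace. This is only a sketch: the names frequencies, shifts, and steps, the stand-in sine table, and the 220 Hz to 440 Hz glide are assumptions made for the example.

```python
from math import tau

import numpy as np

sample_rate = 48_000
# stand-in table: one sine period; any normalized single-cycle table works
wavetable = np.sin(np.linspace(0.0, tau, 2048, endpoint=False))
wavetable_period = float(len(wavetable))
wavetable_indices = np.arange(len(wavetable), dtype=float)
wavetable_freq = sample_rate / wavetable_period

sample_count = 4800
start_index = 0.0

# per-sample frequency: a linear glide from 220 Hz to 440 Hz over the chunk
frequencies = np.linspace(220.0, 440.0, sample_count, endpoint=False)

# per-sample step through the table, in table samples per output sample
shifts = frequencies / wavetable_freq

# running sum of the steps gives the read position of each output sample;
# the first sample is read at start_index, before any step is taken
steps = np.concatenate(([0.0], np.cumsum(shifts)))
indices = start_index + steps[:-1]

audio = np.interp(indices, wavetable_indices, wavetable,
                  period=wavetable_period)

# carry the position over to the next chunk, wrapped to keep it small
start_index = (start_index + steps[-1]) % wavetable_period
```

With a constant frequencies array this reduces to the evenly spaced case, so the same code path could serve both steady notes and glides.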
