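I'm working on a signal processing project in Python. So far I've had some success in non-blocking mode, but it adds considerable latency and clipping to the output.
I want to implement a simple real-time audio filter using PyAudio and scipy.signal, but in the callback function provided in the PyAudio example I can't process in_data when I try to read it. I tried converting it in various ways, without success.
Here is the code I want to get working (read data from the mic, filter it, and output it as fast as possible):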
import pyaudio
import time
import numpy as np
import scipy.signal as signal

WIDTH = 2
CHANNELS = 2
RATE = 44100

p = pyaudio.PyAudio()
b, a = signal.iirdesign(0.03, 0.07, 5, 40)
fulldata = np.array([])

def callback(in_data, frame_count, time_info, status):
    data = signal.lfilter(b, a, in_data)
    return (data, pyaudio.paContinue)

stream = p.open(format=pyaudio.paFloat32,
                channels=CHANNELS,
                rate=RATE,
                output=True,
                input=True,
                stream_callback=callback)

stream.start_stream()

while stream.is_active():
    time.sleep(5)

stream.stop_stream()
stream.close()
p.terminate()
What is the correct way to do this?
Answer 0 (score: 5)
I found the answer to my question in the meantime. The callback looks like this:
def callback(in_data, frame_count, time_info, flag):
    global b, a, fulldata  # global variables for filter coefficients and array
    # in_data arrives as raw bytes; view it as float32 samples
    audio_data = np.frombuffer(in_data, dtype=np.float32)
    # do whatever with data, in my case I want to hear my data filtered in realtime
    audio_data = signal.filtfilt(b, a, audio_data, padlen=200).astype(np.float32)
    fulldata = np.append(fulldata, audio_data)  # saves filtered data in an array
    return (audio_data.tobytes(), pyaudio.paContinue)
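The callback above filters each buffer independently, so every block starts from a fresh filter state. As a possible alternative (not what the answer does), a causal filter can carry its state across blocks using the zi argument of scipy.signal.lfilter. The sketch below assumes a mono float32 stream; make_block_filter and process are hypothetical names introduced only for this illustration.

```python
import numpy as np
import scipy.signal as signal

# Same filter specification as in the question.
b, a = signal.iirdesign(0.03, 0.07, 5, 40)

def make_block_filter(b, a):
    """Return a function that filters raw float32 bytes block by block.

    Assumption: the stream is mono. For interleaved stereo, each channel
    would need its own state and its own call to lfilter.
    """
    zi = np.zeros(max(len(a), len(b)) - 1)  # filter state, initially at rest

    def process(in_bytes):
        nonlocal zi
        x = np.frombuffer(in_bytes, dtype=np.float32)
        # Passing zi in and keeping the returned state makes consecutive
        # blocks behave like one continuous signal.
        y, zi = signal.lfilter(b, a, x, zi=zi)
        return y.astype(np.float32).tobytes()

    return process
```

Inside a PyAudio callback this could be used as return (process(in_data), pyaudio.paContinue), with process created once before the stream is opened.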
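Answer 1 (score: 0)
I ran into a similar problem when trying to use PyAudio's callback mode, but my requirements were:
After a few attempts I got it working. Here is my code snippet (based on the PyAudio example found here):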
import time

import pyaudio
import scipy.signal as ss
import numpy as np
import librosa

track1_data, track1_rate = librosa.load('path/to/wav/track1', sr=44.1e3, dtype=np.float64)
track2_data, track2_rate = librosa.load('path/to/wav/track2', sr=44.1e3, dtype=np.float64)
track3_data, track3_rate = librosa.load('path/to/wav/track3', sr=44.1e3, dtype=np.float64)

# instantiate PyAudio (1)
p = pyaudio.PyAudio()

count = 0
IR_left = first_IR_left # Replace for actual IR
IR_right = first_IR_right # Replace for actual IR

# define callback (2)
def callback(in_data, frame_count, time_info, status):
    global count
    # slice the next frame_count samples out of each track
    track1_frame = track1_data[frame_count*count : frame_count*(count+1)]
    track2_frame = track2_data[frame_count*count : frame_count*(count+1)]
    track3_frame = track3_data[frame_count*count : frame_count*(count+1)]
    # convolve every track with the current left and right impulse responses
    track1_left = ss.fftconvolve(track1_frame, IR_left)
    track1_right = ss.fftconvolve(track1_frame, IR_right)
    track2_left = ss.fftconvolve(track2_frame, IR_left)
    track2_right = ss.fftconvolve(track2_frame, IR_right)
    track3_left = ss.fftconvolve(track3_frame, IR_left)
    track3_right = ss.fftconvolve(track3_frame, IR_right)
    # mix the three tracks with equal weights
    track_left = 1/3 * track1_left + 1/3 * track2_left + 1/3 * track3_left
    track_right = 1/3 * track1_right + 1/3 * track2_right + 1/3 * track3_right
    # interleave both channels into a single buffer
    ret_data = np.empty((track_left.size + track_right.size), dtype=track1_left.dtype)
    ret_data[1::2] = track_left
    ret_data[0::2] = track_right
    ret_data = ret_data.astype(np.float32).tobytes()
    count += 1
    return (ret_data, pyaudio.paContinue)

# open stream using callback (3)
stream = p.open(format=pyaudio.paFloat32,
                channels=2,
                rate=int(track1_rate),
                output=True,
                stream_callback=callback,
                frames_per_buffer=2**16)

# start the stream (4)
stream.start_stream()

# wait for stream to finish (5)
while_count = 0
while stream.is_active():
    while_count += 1
    if while_count % 3 == 0:
        IR_left = first_IR_left # Replace for actual IR
        IR_right = first_IR_right # Replace for actual IR
    elif while_count % 3 == 1:
        IR_left = second_IR_left # Replace for actual IR
        IR_right = second_IR_right # Replace for actual IR
    elif while_count % 3 == 2:
        IR_left = third_IR_left # Replace for actual IR
        IR_right = third_IR_right # Replace for actual IR
    time.sleep(10)

# stop stream (6)
stream.stop_stream()
stream.close()

# close PyAudio (7)
p.terminate()
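Here are some important notes about the code above:
- Using librosa instead of wave lets me work with numpy arrays, which is much nicer than the chunks of data returned by wave.readframes.
- The data type set in p.open(format= must match the format of the ret_data bytes. Also, PyAudio works with float32 at most.
- The even-indexed samples in ret_data go to the right earphone, and the odd-indexed samples go to the left one.
To clarify, this code sends a mix of three tracks to the stereo output and changes the impulse response, and therefore the filter being applied, every 10 seconds. I used it to test a 3D audio application I'm developing, so with head-related impulse responses (HRIRs) the position of the sound changes every 10 seconds.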
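The interleaving and format matching described in the notes above can be sketched in isolation, without opening a stream. The helper names interleave_channels and deinterleave_channels are hypothetical and exist only for this illustration. The slots are labelled by channel number rather than by ear, since which earphone each channel reaches is worth verifying on the actual device; the answer reports the even-indexed samples going to the right earphone.

```python
import numpy as np

def interleave_channels(ch0, ch1):
    """Pack two equally long mono signals into interleaved float32 bytes.

    The result matches a stream opened with format=pyaudio.paFloat32 and
    channels=2: samples alternate ch0, ch1, ch0, ch1, ...
    """
    ch0 = np.asarray(ch0, dtype=np.float32)
    ch1 = np.asarray(ch1, dtype=np.float32)
    if ch0.shape != ch1.shape:
        raise ValueError("both channels must have the same length")
    frames = np.empty(2 * ch0.size, dtype=np.float32)
    frames[0::2] = ch0  # even-indexed samples
    frames[1::2] = ch1  # odd-indexed samples
    return frames.tobytes()

def deinterleave_channels(raw):
    """Split interleaved float32 bytes back into two mono arrays."""
    samples = np.frombuffer(raw, dtype=np.float32)
    return samples[0::2], samples[1::2]
```

Each float32 sample occupies 4 bytes, so a buffer of N stereo frames is 8 * N bytes long.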
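Edit:
This code has a problem: the output contains noise whose frequency corresponds to the frame size (the smaller the frame, the higher the frequency). I fixed this by manually overlapping and adding the frames. Basically, ss.oaconvolve returns an array of size track_frame.size + IR.size - 1, so I split that array into its first track_frame.size elements (which are then used for ret_data) and its last IR.size - 1 elements, which I save for later. Those saved elements are then added to the first IR.size - 1 elements of the next frame. For the first frame, zeros are added.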
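A minimal sketch of that overlap-add bookkeeping for a single channel follows. It is an illustration under assumptions rather than the answer's actual code: OverlapAddConvolver is a hypothetical name, the impulse response is assumed to keep a fixed length, and ss.fftconvolve is used as in the snippet above (its default full mode returns frame.size + ir.size - 1 samples, the size quoted in the edit).

```python
import numpy as np
import scipy.signal as ss

class OverlapAddConvolver:
    """Convolve a stream with an impulse response one frame at a time."""

    def __init__(self, ir):
        self.ir = np.asarray(ir, dtype=np.float64)
        # Saved tail of the previous frame; zeros for the first frame.
        self.tail = np.zeros(self.ir.size - 1)

    def process(self, frame):
        frame = np.asarray(frame, dtype=np.float64)
        full = ss.fftconvolve(frame, self.ir)  # frame.size + ir.size - 1 samples
        # Add what spilled over from earlier frames to the start of this one.
        full[:self.tail.size] += self.tail
        # Keep the last ir.size - 1 samples for the next call.
        self.tail = full[frame.size:].copy()
        # Only the first frame.size samples are played now.
        return full[:frame.size]
```

In the callback, one convolver per track and per ear would replace each direct ss.fftconvolve call, so that every returned frame has exactly frame_count samples.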