I have created a very simple voice chat with PyAudio, but the sound is a bit rough: you often hear crackling noise, like in an old movie. It is probably caused by voice chunks that get lost because I send them over UDP. Is it possible to reduce the noise somehow? Also, I want to play a sound effect when the user moves over a button, but for some reason I cannot merge the two tracks (the effect and the voice)!
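One idea I tried to reason about for the crackle from lost UDP chunks is simple packet-loss concealment: instead of replaying the last chunk at full volume (or playing silence), fade the repeated chunk out so the gap is less audible. This is only a sketch; the function name and the fade factor are my own, not part of the code below.

```python
import numpy as np

def conceal_lost_chunk(last_chunk, fade=0.5):
    """Packet-loss concealment sketch: when no new chunk arrived over UDP,
    replay the previous chunk attenuated with a linear fade-out instead of
    playing it at full volume, so the repetition crackles less.

    last_chunk: raw 16-bit PCM bytes, or b"" when there is nothing to replay.
    Returns bytes of the same length as the input.
    """
    if not last_chunk:
        return last_chunk
    samples = np.frombuffer(last_chunk, dtype=np.int16).astype(np.float32)
    # Linear envelope from `fade` down to 0 across the chunk, so the sound
    # decays toward silence rather than cutting off abruptly.
    envelope = np.linspace(fade, 0.0, num=len(samples), dtype=np.float32)
    return (samples * envelope).astype(np.int16).tobytes()
```

In the `myCallback` below this would replace the plain `self.stream.write(self.lastSample)` in the `elif self.lastSample:` branch.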
Here is the most important class, `Sound`. It runs in a thread, so it can loop forever.
import time
import threading

import numpy as np
import pyaudio
import wave  # the button effects are played from wav files


class Sound(threading.Thread):
    WIDTH = 2
    CHANNELS = 2
    RATE = 44100

    def __init__(self):
        super().__init__(daemon=True)
        self.voiceStreams = []
        self.effectStreams = []
        self.vVolume = 1
        self.eVolume = 0.5
        self.voip = None
        self.p = pyaudio.PyAudio()
        self.stream = self.p.open(format=self.p.get_format_from_width(Sound.WIDTH),
                                  channels=Sound.CHANNELS,
                                  rate=Sound.RATE,
                                  input=True,
                                  output=True,
                                  #stream_callback=self.callback
                                  )
        self.nextSample = b""
        self.lastSample = b""
        self.stream.start_stream()

    def run(self):
        while True:
            self.myCallback()

    def myCallback(self):
        _time = time.perf_counter()
        if self.nextSample:
            self.stream.write(self.nextSample)
            self.lastSample = self.nextSample
        elif self.lastSample:
            # My idea: when there is no data (because UDP did not deliver it),
            # replay the last chunk so nobody hears the short burst of silence.
            self.stream.write(self.lastSample)
            self.lastSample = b""
        _time = time.perf_counter()
        #print("{0:d} ---- {1:d} --- timeWrite: {2:.1f}".format(len(self.voiceStreams), self.stream.get_read_available(), (time.perf_counter() - _time) * 1000), end=" ")
        if self.stream.get_read_available() > 1023:
            mic = self.stream.read(1024)
        else:
            mic = b""
        #print("timeRead: {0:.1f}".format((time.perf_counter() - _time) * 1000), end=" ")
        if mic and self.voip:
            self.voip.sendDatagram(mic)  # send the chunk of sound to my UDP client
        _time = time.perf_counter()
        data = np.zeros(2048, np.float64)  # float accumulator, cast to int16 at the end
        length = len(self.voiceStreams)  # read the voice data
        l1 = length
        for i in range(length):
            s = self.voiceStreams.pop(0)
            # Merge multiple voices with numpy; the volume of each voice is
            # reduced based on how many voices there are.
            data += s / length * self.vVolume * 0.4
        length = len(self.effectStreams)
        toPop = []  # indexes of effects that finished playing
        for i in range(length):
            s = self.effectStreams[i].readframes(1024)
            if s == b"":  # no data left to play
                toPop.append(i - len(toPop))
            else:
                d = np.frombuffer(s, np.int16)
                # Sadly every numpy array must have the same length, so when I
                # reach the end of a track that has e.g. only 1500 samples left
                # I must throw it away, because numpy does not let me add it to
                # an array of length 2048.
                if len(d) > 2047:
                    # Again merge the sounds with numpy and reduce the volume.
                    data += d / length * self.eVolume * 0.3
        for i in toPop:  # delete a track once it reaches its end
            del self.effectStreams[i]
        if np.any(data):  # there is data to play
            # Prepare the next chunk (it should be about 20 ms, but I am not sure).
            self.nextSample = data.astype(np.int16).tobytes()
        else:
            self.nextSample = b""
        #print("timeRest: {0:.1f}".format((time.perf_counter() - _time) * 1000), end=" || ")
        print("HOW MANY CHUNKS OF VOICE I GOT:", l1)
        # Oddly, when I print the stream read/write times, it usually prints
        # something like: 20 ms, 20 ms, 30 ms, 20 ms, 20 ms, 30 ms, 20 ms ...

    def close(self):
        self.stream.stop_stream()
        self.stream.close()
        self.p.terminate()
The UDP server and client are very simple (they work fine, so I am not posting them here). The client simply sends all its data to the server, and the server forwards everything to all clients. I do not tell anyone who sent the data, which means that if a datagram arrives too late, I play two chunks from the same client at once (because I assume they came from different clients)!
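The "two chunks from the same client" problem could be avoided by prefixing every datagram with a small header carrying a sender id and a sequence number, so the receiver can keep per-sender streams apart and drop late or duplicate chunks. A sketch under that assumption (the header format and names are mine, not part of the existing server or client):

```python
import struct

# Assumed header layout: 4-byte sender id + 2-byte wrapping sequence number,
# network byte order. 6 bytes of overhead per datagram.
HEADER = struct.Struct("!IH")

def pack_chunk(sender_id, seq, payload):
    """Prefix a voice chunk with (sender_id, seq) before sending over UDP."""
    return HEADER.pack(sender_id, seq & 0xFFFF) + payload

def unpack_chunk(datagram):
    """Split a received datagram back into (sender_id, seq, payload)."""
    sender_id, seq = HEADER.unpack_from(datagram)
    return sender_id, seq, datagram[HEADER.size:]
```

On the receiving side, chunks would then go into one queue per `sender_id`, and a chunk whose `seq` is older than the last one played for that sender can simply be discarded.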
Here are the wav files: Dropbox repository. I did not create them; I downloaded them from http://www.freesound.org/people/ERH/sounds/31135/ and they are licensed under the Attribution license.
!! I have also added an "OUTPUT.txt" file to the Dropbox folder, which shows what Python printed when this example was run between two people (I was receiving voice data from only one user).
Thanks for any advice.