How to detect and analyze audio levels

Asked: 2014-02-20 23:42:38

Tags: python audio record

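I'm trying to write a script that listens to my microphone input, prints "ON" when it detects that someone is speaking, and prints "OFF" when it detects that the person has stopped speaking.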

Here is part of what I have so far:

import collections
import audioop
import pyaudio
import time
import math

CHUNK = 1024 # The size of the chunk to read from the mic stream
FORMAT = pyaudio.paInt24 # The format depends on the mic used
CHANNELS = 2 # The number of channels used to record the audio. Depends on the mic
RATE = 44100 # The sample rate for audio. Depends on the mic
THRESHOLD = 6000 # The threshold intensity that defines silence. 
                 # an int lower than THRESHOLD is considered silence 
RECORD_SECONDS=5

def test():
    p = pyaudio.PyAudio()

    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE,
                    input=True, frames_per_buffer=CHUNK)

    # Keep roughly one second of readings; maxlen must be an integer
    q = collections.deque(maxlen=RATE // CHUNK)

    flag = True
    print("--Listening--")  

    for i in range(0,int(RATE/CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        q.append(abs(audioop.avg(data,4)))


    print(sum(q)/(RATE/CHUNK*RECORD_SECONDS))

    stream.stop_stream()
    stream.close()
    p.terminate()

def listen():
    p = pyaudio.PyAudio()

    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE,
                    input=True, frames_per_buffer=CHUNK)

    # Keep roughly one second of readings; maxlen must be an integer
    q = collections.deque(maxlen=RATE // CHUNK)

    flag = True  # True while the state is considered "ON"
    print("--Listening--")

    while True:
        data = stream.read(CHUNK)
        q.append(abs(audioop.avg(data, 4)))
        level = sum(q) / (RATE / CHUNK * RECORD_SECONDS)
        if flag:
            if level < 3500000:
                print("OFF")
                flag = False
        else:
            if level > 3500000:
                print("ON")
                flag = True


if __name__ == '__main__':
    test()

The listen() function is the one that prints 'ON' and 'OFF', while test() is used to check the audio level.

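I'm not entirely sure that audioop is the right approach. After playing with the test() function for a few minutes, its output doesn't seem to track the volume level consistently. I can get a very high value (8,000,000) for whispering, while getting 4,000,000 for normal talking and 3,000,000 for not talking at all (silence).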
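One possible explanation for readings like these, offered as an assumption rather than a confirmed diagnosis: pyaudio.paInt24 delivers 3 bytes per sample, while audioop.avg(data, 4) interprets the buffer as 4-byte samples, and avg returns the signed mean of the samples rather than a loudness measure. A root-mean-square (RMS) level computed with the matching sample width usually tracks loudness more closely. The sketch below is pure Python 3, so it does not rely on audioop (which was removed in Python 3.13). The names decode_samples, rms_level, and to_dbfs are hypothetical helpers, not part of PyAudio.

```python
import math


def decode_samples(data, sample_width):
    """Decode little-endian signed PCM bytes into a list of ints."""
    # Ignore any trailing bytes that do not form a whole sample.
    usable = len(data) - len(data) % sample_width
    return [
        int.from_bytes(data[i:i + sample_width], "little", signed=True)
        for i in range(0, usable, sample_width)
    ]


def rms_level(data, sample_width):
    """RMS amplitude of one chunk, normalised to the range 0.0..1.0."""
    samples = decode_samples(data, sample_width)
    if not samples:
        return 0.0
    # Largest magnitude a signed sample of this width can hold.
    full_scale = float(1 << (8 * sample_width - 1))
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / full_scale


def to_dbfs(level, floor=-120.0):
    """Convert a normalised level to decibels relative to full scale."""
    if level <= 0:
        return floor
    return max(floor, 20 * math.log10(level))
```

With the settings in the script, each chunk returned by stream.read(CHUNK) could be passed as rms_level(data, 3); p.get_sample_size(FORMAT) reports the width for other formats. Interleaved stereo samples are pooled into a single level here, which is a simplification.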

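Is there a way to make the readings consistent with the audio level? That is, I'd like to get one range for silence, a higher one for whispering, a still higher one for talking, and so on.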
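Assuming a per-chunk level that rises with loudness is available, one common way to turn it into stable ON/OFF events is to smooth it over a short window and use two thresholds (hysteresis), so the state does not flicker near a single cutoff. The sketch below is illustrative only: SpeechGate and its threshold values are hypothetical and would need calibrating against a real microphone. Unlike the script above, it divides by the number of readings actually held in the window.

```python
import collections


class SpeechGate:
    """Tracks ON/OFF state from a stream of per-chunk levels (hypothetical helper)."""

    def __init__(self, on_threshold, off_threshold, window=10):
        if off_threshold > on_threshold:
            raise ValueError("off_threshold must not exceed on_threshold")
        self.on_threshold = on_threshold
        self.off_threshold = off_threshold
        self.recent = collections.deque(maxlen=window)
        self.active = False

    def update(self, level):
        """Feed one level; return "ON" or "OFF" on a state change, else None."""
        self.recent.append(level)
        # Average over the readings actually present in the window.
        smoothed = sum(self.recent) / len(self.recent)
        if not self.active and smoothed > self.on_threshold:
            self.active = True
            return "ON"
        if self.active and smoothed < self.off_threshold:
            self.active = False
            return "OFF"
        return None
```

In a capture loop, each new level would be passed to update() and any non-None result printed. Further bands (for example whisper versus normal speech) could be added as extra threshold pairs once consistent levels have been measured.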

0 answers:

No answers