Audio fingerprinting problem with streaming audio input from a microphone

Time: 2017-11-11 00:19:19

Tags: python-2.7 audio signal-processing audio-streaming audio-fingerprinting

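I have been working on a streaming audio fingerprinter in Python and have tried several different libraries/services (Gracenote, AcoustID, ACRCloud), but I cannot get it to work.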

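My current code uses python-sounddevice to record raw audio in float32 format from my microphone (or from the sound card, to reduce noise) and fills a ring buffer. Every 5 seconds I read the buffer, convert it to 16-bit PCM, and pass it to the acoustid fingerprint function.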
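For illustration, the buffering and conversion steps described here can be sketched on their own with only the standard library. This is a minimal sketch and not the code from the question: `floats_to_pcm16` and the `deque`-based ring are hypothetical stand-ins, and the clipping step is an added assumption so that samples slightly outside [-1.0, 1.0] still fit in 16 bits.

```python
import struct
from collections import deque


def floats_to_pcm16(samples):
    """Pack floats in [-1.0, 1.0] as little-endian signed 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))   # clip so the scaled value fits in int16
        ints.append(int(s * 32767))  # struct's "h" format needs integers
    return struct.pack("<%dh" % len(ints), *ints)


# A deque with maxlen drops the oldest samples automatically,
# so it always holds the most recent `maxlen` samples.
ring = deque(maxlen=4)
ring.extend([0.0, 0.5, -0.5, 1.0])
ring.extend([2.0])                   # pushes out the oldest sample (0.0)

pcm = floats_to_pcm16(ring)          # 4 samples -> 8 bytes
```

A `deque` is slower than a NumPy array for long buffers, but it makes the "keep only the latest N samples" behaviour easy to verify in isolation.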

My code:

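import array
import struct
import time as time2

import acoustid
import numpy as np
import sounddevice as sd

API_KEY = "YOUR_API_KEY"  # placeholder: AcoustID application key (not shown in the question)
RATE = 16000              # sample rate in Hz
duration = 15             # seconds of audio held in the ring buffer
run_time = 60             # total recording time in seconds

class Buffer(object):
    """Ring buffer that stores each write twice so read() can return one contiguous slice."""

    def __init__(self, size, dtype=np.float32):
        self.size = size
        self.buf = np.zeros(self.size * 2, dtype=dtype)
        self.i = 0  # total number of samples written so far

    def extend(self, data):
        if len(data.shape) > 1:
            raise ValueError("data must be a flat array")

        l = data.size
        if l > self.size:
            raise ValueError("data cannot be larger than size")

        start = (self.i % self.size)
        end = start + l

        # Mirror position of the same write in the second half of the buffer.
        start_2 = start + self.size
        end_2 = end + self.size

        self.i += l

        if end < self.buf.size:
            self.buf[start:end] = data

        if end_2 < self.buf.size:
            self.buf[start_2:end_2] = data

    def read(self):
        start = (self.i % self.size)
        end = start + self.size

        return self.buf[start:end]

# Created after the class and RATE are defined.
b = Buffer(duration * RATE)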

def float_to_16_bit_pcm(raw_floats):
    # Scale float samples to signed 16-bit little-endian PCM.
    floats = array.array('f', raw_floats)
    samples = [int(sample * 32767) for sample in floats]  # "h" format expects integers
    raw_ints = struct.pack("<%dh" % len(samples), *samples)
    return raw_ints

i = 0                 # number of callbacks processed so far
start = time2.time()  # wall-clock time when the script started

def callback(indata, frames, time, status):  # sd.Stream callbacks also receive an outdata argument
    global i
    if status:
        print(status)

    b.extend(indata.squeeze())
    elapsed_time = time2.time() - start

    # Once the buffer has had time to fill, fingerprint it on every 50th callback.
    if elapsed_time > duration and i % 50 == 0:
        aud = b.read()
        pcm16 = float_to_16_bit_pcm(aud)
        fp = acoustid.fingerprint(16000, 1, pcm16)
        response = acoustid.lookup(API_KEY, fp, 15)
        print(response)
    i += 1

with sd.InputStream(samplerate=16000, dtype=np.float32, channels=1, callback=callback):
    sd.sleep(int(run_time * 1000))

The response comes back with an error: {u'status': u'error', u'error': {u'message': u'invalid fingerprint', u'code': 3}}
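One thing that may be worth checking here: pyacoustid's `fingerprint()` describes its third argument as an iterable of PCM blocks (byte strings), and in Python 2 iterating over a single byte string yields one character at a time. The sketch below is only an assumption about how the input could be shaped, not a confirmed fix; `pcm_blocks` and its `block_bytes` parameter are hypothetical helpers introduced for illustration.

```python
def pcm_blocks(pcm_bytes, block_bytes=4096):
    """Yield consecutive byte-string blocks from one PCM byte string.

    block_bytes must be even so no 16-bit sample is split across blocks.
    """
    if block_bytes % 2:
        raise ValueError("block_bytes must be even for 16-bit samples")
    for offset in range(0, len(pcm_bytes), block_bytes):
        yield pcm_bytes[offset:offset + block_bytes]


# Hypothetical usage with the variables from the code above (not run here):
#   fp = acoustid.fingerprint(16000, 1, pcm_blocks(pcm16))
#   response = acoustid.lookup(API_KEY, fp, 15)

# Small self-check on dummy data: 10 bytes split into 4-byte blocks.
blocks = list(pcm_blocks(b"\x00\x01" * 5, block_bytes=4))
```

Passing `[pcm16]` (a one-element list) would hand over the same data as a single block; the generator form just keeps each block small.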

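I know the creator of Chromaprint has mentioned that, as of Chromaprint 1.4 (in C++), it is possible to feed in a continuous audio stream and get fingerprints out of it: https://oxygene.sk/2016/12/chromaprint-1-4-released/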

Does anyone have experience with this, or any advice to offer?

Thanks

0 answers:

No answers yet.