Continuous speech recognition from a microphone on MS Azure

Time: 2021-05-25 15:43:32

Tags: python azure speech-recognition speech-to-text

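I want to use the Azure Speech service for speech recognition from a microphone. I have a Python program that runs fine using recognize_once_async(), but it only recognizes the first utterance, with a 15-second audio limit. I did some research on the topic and looked at the sample code from MS (https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/python/console/speech_sample.py), but could not find anything that does continuous speech recognition from a microphone. Any hints?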

1 answer:

Answer 0: (score: 0)

You can try the code below:

import azure.cognitiveservices.speech as speechsdk
import time

# Creates an instance of a speech config with specified subscription key and service region.
# Replace with your own subscription key and region identifier from here: https://aka.ms/speech/sdkregion
speech_key, service_region = "6.....9", "eastus"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

# Sets the recognition language, then creates a recognizer with the given settings.
# No audio config is passed, so the default microphone is used as input.
speech_config.speech_recognition_language = "en-US"
# source_language_config = speechsdk.languageconfig.SourceLanguageConfig("en-US", "The Endpoint ID for your custom model.")
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

done = False


def stop_cb(evt):
    """Stops continuous recognition when a session_stopped or canceled event arrives."""
    global done
    print('CLOSING on {}'.format(evt))
    speech_recognizer.stop_continuous_recognition()
    done = True

# Connect callbacks to the events fired by the speech recognizer
speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
# stop continuous recognition on either session stopped or canceled events
speech_recognizer.session_stopped.connect(stop_cb)
speech_recognizer.canceled.connect(stop_cb)

speech_recognizer.start_continuous_recognition()

while not done:
    time.sleep(.5)

Explanation: when you do not pass an audio config, the default input source is the microphone.

If you want to configure or customize the input, pass an audio config (speechsdk.audio.AudioConfig) to the recognizer.
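As a rough sketch of that customization, the helper below chooses between the default microphone, a specific input device, or a WAV file. The function names audio_config_kwargs and build_recognizer are hypothetical and not part of the SDK; the keyword names use_default_microphone, device_name and filename are AudioConfig parameters. The device identifier format depends on the platform, so check the SDK documentation for yours.

```python
from typing import Optional


def audio_config_kwargs(device_name: Optional[str] = None,
                        wav_path: Optional[str] = None) -> dict:
    """Pick keyword arguments for speechsdk.audio.AudioConfig (hypothetical helper)."""
    if device_name and wav_path:
        raise ValueError("choose either a device or a file, not both")
    if wav_path:
        # Read audio from a WAV file instead of a live input device.
        return {"filename": wav_path}
    if device_name:
        # Use a specific input device; the ID format is platform-specific.
        return {"device_name": device_name}
    # Same behavior as passing no audio config at all.
    return {"use_default_microphone": True}


def build_recognizer(speech_config, device_name=None, wav_path=None):
    """Create a SpeechRecognizer with an explicit audio config (hypothetical helper)."""
    # Imported lazily so audio_config_kwargs can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    audio_config = speechsdk.audio.AudioConfig(
        **audio_config_kwargs(device_name, wav_path))
    return speechsdk.SpeechRecognizer(speech_config=speech_config,
                                      audio_config=audio_config)
```

With the speech_config from the answer, build_recognizer(speech_config) should behave like the original code, while build_recognizer(speech_config, wav_path="sample.wav") would read from a file (the file name here is only an illustration).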

Continuous recognition exposes several event callbacks, such as recognizing, recognized, and canceled.
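To do more than print the events, one option is to collect the final text of each utterance in a small handler object. The sketch below is an assumption about how you might structure this, not part of the SDK: TranscriptCollector is a hypothetical class that stores evt.result.text from recognized events and uses a threading.Event in place of the polling loop.

```python
import threading


class TranscriptCollector:
    """Collects final recognition results and signals when the session ends."""

    def __init__(self):
        self.phrases = []
        self.finished = threading.Event()

    def on_recognized(self, evt):
        # `recognized` fires once per finished utterance; skip empty results.
        text = evt.result.text
        if text:
            self.phrases.append(text)

    def on_stopped(self, evt):
        # Fired on session_stopped or canceled; wakes up the waiting thread.
        self.finished.set()

    def attach(self, recognizer):
        # Wire the handlers to the recognizer's event signals.
        recognizer.recognized.connect(self.on_recognized)
        recognizer.session_stopped.connect(self.on_stopped)
        recognizer.canceled.connect(self.on_stopped)
        return self
```

A possible usage with the recognizer from the answer: call collector = TranscriptCollector().attach(speech_recognizer), then speech_recognizer.start_continuous_recognition(), wait with collector.finished.wait(), call speech_recognizer.stop_continuous_recognition() from the main thread, and read the text from collector.phrases.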
