Question

I'm using the function provided in the GCS documentation, which lets me transcribe audio files stored in Cloud Storage:

    def transcribe_gcs(gcs_uri):
        """Asynchronously transcribes the audio file specified by the gcs_uri."""
        from google.cloud import speech
        from google.cloud.speech import enums
        from google.cloud.speech import types

        client = speech.SpeechClient()

        audio = types.RecognitionAudio(uri=gcs_uri)
        config = types.RecognitionConfig(
            encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
            sample_rate_hertz=48000,
            language_code='en-US')

        operation = client.long_running_recognize(config, audio)

        print('Waiting for operation to complete...')
        response = operation.result(timeout=2000)

        # Print the first alternative of all the consecutive results.
        for result in response.results:
            print('Transcript: {}'.format(result.alternatives[0].transcript))
            print('Confidence: {}'.format(result.alternatives[0].confidence))

        return ' '.join(result.alternatives[0].transcript for result in response.results)

By default, sample_rate_hertz is set to 16000. I changed it to 48000, but I have not been able to set it any higher, such as 64k or 96k. Is 48k the upper bound for the sample rate?

Answer 1

As stated in the documentation for Cloud Speech API, 48000 Hz is indeed the upper bound supported by this API.

Sample rates between 8000 Hz and 48000 Hz are supported within the Speech API.
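As a rough illustration of that limit, a rate can be checked before it is placed in the recognition config. The constant and function names below are hypothetical and not part of the Speech API; only the two bounds come from the documentation.

```python
# Bounds taken from the documented range; the names are illustrative only.
MIN_SAMPLE_RATE_HZ = 8000
MAX_SAMPLE_RATE_HZ = 48000


def is_supported_sample_rate(sample_rate_hertz):
    """Return True if the rate falls inside the documented 8000-48000 Hz range."""
    return MIN_SAMPLE_RATE_HZ <= sample_rate_hertz <= MAX_SAMPLE_RATE_HZ


# The rates mentioned in the question: only the first two pass the check.
for rate in (16000, 48000, 64000, 96000):
    print(rate, is_supported_sample_rate(rate))
```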

So, to use audio recorded at a higher sample rate, you will have to resample your audio files.
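One possible way to do the resampling is to call the ffmpeg command-line tool from Python. This is only a sketch: it assumes ffmpeg is installed and on the PATH, the tool choice is mine rather than something the documentation prescribes, and the file names are placeholders. The -ar option sets the output audio sample rate.

```python
import subprocess


def build_resample_command(src_path, dst_path, sample_rate_hertz=48000):
    """Build an ffmpeg command that rewrites src_path at the given sample rate."""
    # -i names the input file; -ar sets the output audio sampling rate in Hz.
    return ["ffmpeg", "-i", src_path, "-ar", str(sample_rate_hertz), dst_path]


def resample(src_path, dst_path, sample_rate_hertz=48000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(
        build_resample_command(src_path, dst_path, sample_rate_hertz),
        check=True,
    )


# Hypothetical file names, shown for illustration only.
print(" ".join(build_resample_command("recording_96k.flac", "recording_48k.flac")))
```

The resampled file would then need to be uploaded to Cloud Storage, with sample_rate_hertz in the config set to match the new rate.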

I would also point you to this other page, where you can find basic information on the features supported by Cloud Speech API.

Possible sample rates in Google Speech-to-Text?
