Question

I have been reading some demo applications that use the Google Speech API. They all use an audio file for the demonstration.

Example:

SpeechClient speech = SpeechClient.create();
// The path to the audio file to transcribe
String fileName = "./resources/RecordAudio.flac";

// Reads the audio file into memory
Path path = Paths.get(fileName);
byte[] data = Files.readAllBytes(path);
ByteString audioBytes = ByteString.copyFrom(data);

// Builds the sync recognize request
RecognitionConfig config = RecognitionConfig.newBuilder()
 .setEncoding(AudioEncoding.FLAC)
 .setSampleRateHertz(16000)
 .setLanguageCode("vi-VN")
 .build();
RecognitionAudio audio = RecognitionAudio.newBuilder()
 .setContent(audioBytes)
 .build();

// Performs speech recognition on the audio file
RecognizeResponse response = speech.recognize(config, audio);
List<SpeechRecognitionResult> results = response.getResultsList();

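Instead of using an audio file, I am trying to record audio with the Java Sound API and send it to the Google Speech API for recognition. After recording, though, all I have is the raw audio data in byteOut (see the recording function below).

So I tried to convert byteOut.

Recording function: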

    // byteOut, microphone (TargetDataLine), tempBuffer (byte[]) and
    // stopRecording (boolean) are fields of the enclosing class, so that
    // the recording thread below can read and assign them.
    ByteArrayOutputStream byteOut = null;

    // Configure audio format: 16 kHz, 16-bit, mono, signed, big-endian
    AudioFormat audioFormat = new AudioFormat(16000,16,1,true,true);

    try {
        DataLine.Info info = new DataLine.Info(TargetDataLine.class,
                audioFormat);
        microphone = (TargetDataLine) AudioSystem.getLine(info);
        microphone.open(audioFormat);
        // Start recording
        microphone.start();

        // Create thread for recording
        (new Thread(new Runnable() {
            public void run() {
                byteOut = new ByteArrayOutputStream();
                stopRecording = false;
                try {
                    // Copy microphone data into byteOut until asked to stop
                    while (!stopRecording) {
                        int count = microphone.read(tempBuffer, 0,
                                tempBuffer.length);
                        if (count > 0) {
                            byteOut.write(tempBuffer, 0, count);
                        }
                    }

                    byteOut.close();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        })).start();
    } catch (LineUnavailableException e) {
        // Thrown by getLine/open when the microphone cannot be acquired
        e.printStackTrace();
    }

When recording finishes, the recorded data is held in the variable byteOut.
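
Because byteOut holds headerless PCM samples, its size maps directly to a playing time. The sketch below works that out from the AudioFormat used for recording. It may be useful as a sanity check, since synchronous recognize requests are meant for short clips (roughly one minute, as far as I understand Google's documented limits). The class name `RawAudioDuration`, the helper `seconds`, and the byte count are illustrative only, not part of the original code.

```java
import javax.sound.sampled.AudioFormat;

class RawAudioDuration {
    /** Returns the playing time, in seconds, of raw PCM data in the given format. */
    static double seconds(int byteCount, AudioFormat format) {
        // One frame = one sample per channel; 2 bytes for 16-bit mono.
        long frames = byteCount / format.getFrameSize();
        return frames / (double) format.getFrameRate();
    }

    public static void main(String[] args) {
        // Same format as the recording function: 16 kHz, 16-bit, mono, signed, big-endian.
        AudioFormat format = new AudioFormat(16000, 16, 1, true, true);
        int byteCount = 320_000; // hypothetical stand-in for byteOut.size()
        System.out.println(seconds(byteCount, format)); // prints 10.0
    }
}
```

In the recording code, `byteOut.size()` would take the place of the hard-coded byte count.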

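To recognize the audio data in byteOut, I tried writing it to RecordAudio.wav and then converting RecordAudio.wav to RecordAudio.flac (I used Audacity for the conversion). Finally, I used the Google Speech API to recognize the RecordAudio.flac file. I realize this solution is slow and complicated.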

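So, after reading the Google Speech API example above, I think it should be possible to recognize the recorded data directly, without going through an audio file. The raw recorded audio is stored in byteOut.

But I really don't know how to convert byteOut into the ByteString audioBytes used in the Google Speech API example above: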

ByteString audioBytes = ByteString.copyFrom(data);
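
One possible direction, as a minimal sketch rather than a confirmed solution: `byteOut.toByteArray()` returns the recorded bytes as a `byte[]`, which is what `ByteString.copyFrom` accepts. Raw PCM would then be declared as `AudioEncoding.LINEAR16` instead of `FLAC`. Google documents LINEAR16 as 16-bit signed little-endian samples, while the AudioFormat above records big-endian, so either the last constructor argument would change to `false` or the bytes would be swapped first. The class `PcmByteOrder` and helper `swap16` are hypothetical names, and the Google calls appear only as comments because the client library is not imported here.

```java
import java.io.ByteArrayOutputStream;

class PcmByteOrder {
    /** Swaps each byte pair: 16-bit big-endian samples become little-endian. */
    static byte[] swap16(byte[] pcm) {
        // Drop a trailing odd byte, which cannot be part of a full sample.
        byte[] out = new byte[pcm.length - (pcm.length % 2)];
        for (int i = 0; i + 1 < pcm.length; i += 2) {
            out[i] = pcm[i + 1];
            out[i + 1] = pcm[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the stream filled by the recording thread.
        ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
        byteOut.write(new byte[] {0x12, 0x34, 0x56, 0x78}, 0, 4);

        byte[] raw = byteOut.toByteArray();  // copy of everything recorded
        byte[] littleEndian = swap16(raw);   // only needed for a big-endian AudioFormat

        // With the Google client library on the classpath (assumed, not imported):
        //   ByteString audioBytes = ByteString.copyFrom(littleEndian);
        //   RecognitionConfig.newBuilder().setEncoding(AudioEncoding.LINEAR16)
        //       .setSampleRateHertz(16000) ...
        System.out.println(littleEndian.length); // prints 4
    }
}
```

Recording with `new AudioFormat(16000,16,1,true,false)` would avoid the swap entirely, which is probably the simpler choice.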

Does anyone know a solution? Thank you very much.

Convert audio data from the microphone to bytes for recognition with the Google Speech API

0 answers: