I have been trying to write a simple speech-to-text program. I am using Google's API. Thanks to this source, I can convert speech and write it to a text file. But at the same time, I have to set the number of seconds to listen in advance. I want the program to recognize on its own that the speech has ended. As far as I know, single_utterance can be set to true, but I don't know how to change my code accordingly. Below is the code I have posted.
static async Task&lt;object&gt; StreamingMicRecognizeAsync(int seconds)
{
    if (NAudio.Wave.WaveIn.DeviceCount &lt; 1)
    {
        MessageBox.Show("No microphone!");
        return -1;
    }
    var speech = SpeechClient.Create();
    var streamingCall = speech.StreamingRecognize();
    await streamingCall.WriteAsync(
        new StreamingRecognizeRequest()
        {
            StreamingConfig = new StreamingRecognitionConfig()
            {
                Config = new RecognitionConfig()
                {
                    Encoding =
                        RecognitionConfig.Types.AudioEncoding.Linear16,
                    SampleRateHertz = 16000,
                    LanguageCode = "tr-TR",
                },
                InterimResults = false,
                SingleUtterance = true,
            }
        });
    // Print responses as they arrive.
    Task printResponses = Task.Run(async () =>
    {
        while (await streamingCall.ResponseStream.MoveNext(
            default(CancellationToken)))
        {
            foreach (var result in streamingCall.ResponseStream
                .Current.Results)
            {
                foreach (var alternative in result.Alternatives)
                {
                    Console.WriteLine(alternative.Transcript);
                    if (result.IsFinal)
                    {
                        Console.WriteLine(alternative.Transcript);
                        // writing results to a text file
                        using (System.IO.StreamWriter file =
                            new System.IO.StreamWriter(@"C:\env\output.txt", false))
                        {
                            file.WriteLine(alternative.Transcript);
                        }
                    }
                }
            }
        }
    });
    // Read from the microphone and stream to API.
    object writeLock = new object();
    bool writeMore = true;
    var waveIn = new NAudio.Wave.WaveInEvent();
    waveIn.DeviceNumber = 0;
    waveIn.WaveFormat = new NAudio.Wave.WaveFormat(16000, 1);
    waveIn.DataAvailable +=
        (object sender, NAudio.Wave.WaveInEventArgs args) =>
        {
            lock (writeLock)
            {
                if (!writeMore) return;
                streamingCall.WriteAsync(
                    new StreamingRecognizeRequest()
                    {
                        AudioContent = Google.Protobuf.ByteString
                            .CopyFrom(args.Buffer, 0, args.BytesRecorded)
                    }).Wait();
            }
        };
    waveIn.StartRecording();
    await Task.Delay(TimeSpan.FromSeconds(seconds));
    // Stop recording and shut down.
    waveIn.StopRecording();
    lock (writeLock) writeMore = false;
    await streamingCall.WriteCompleteAsync();
    await printResponses;
    Environment.Exit(Environment.ExitCode);
    return 0;
}
Answer 0 (score: 0)

The answer may come a bit late, but here is how I did it:
Config = new RecognitionConfig()
{
    Encoding =
        RecognitionConfig.Types.AudioEncoding.Linear16,
    SampleRateHertz = 16000,
    LanguageCode = "tr-TR",
},
InterimResults = false,
SingleUtterance = true,
1) Set the flag SingleUtterance = true, as above.
2) Create a variable, e.g. var recognitionEnded = false;, before the response-printing task starts.
3) Catch the end-of-utterance event in the results loop (example below).
Task printResponses = Task.Run(async () =>
{
    while (await streamingCall.ResponseStream.MoveNext(
        default(CancellationToken)))
    {
        // The API raises this event once it detects that the utterance has ended.
        if (streamingCall.ResponseStream.Current.SpeechEventType ==
            StreamingRecognizeResponse.Types.SpeechEventType.EndOfSingleUtterance)
        {
            recognitionEnded = true;
            Debug.WriteLine("End of detection");
        }
        foreach (var result in streamingCall.ResponseStream
            .Current.Results)
        {
            foreach (var alternative in result.Alternatives)
            {
                Debug.WriteLine(alternative.Transcript);
            }
        }
    }
});
4) After StartRecording, simply wait for recognitionEnded to be set to true (example below).
Debug.WriteLine("Speak now.");
waveIn.StartRecording();
// Poll until the response loop flips the flag; sleeping briefly
// avoids burning a CPU core in a tight spin.
while (!recognitionEnded) { System.Threading.Thread.Sleep(100); }
waveIn.StopRecording();
Debug.WriteLine("End of recording");
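
For completeness, here is a minimal sketch of the question's method with the four steps above folded in. It assumes the same Google.Cloud.Speech.V1 and NAudio packages used in the question; the method name StreamingMicRecognizeUntilUtteranceEndAsync and the 100 ms polling interval are my own choices, and the plain bool flag (unsynchronized, as in the answer) is fine for a demo but not guaranteed by the memory model.

static async Task StreamingMicRecognizeUntilUtteranceEndAsync()
{
    var speech = SpeechClient.Create();
    var streamingCall = speech.StreamingRecognize();
    // Step 1: SingleUtterance = true, as in the original config.
    await streamingCall.WriteAsync(new StreamingRecognizeRequest()
    {
        StreamingConfig = new StreamingRecognitionConfig()
        {
            Config = new RecognitionConfig()
            {
                Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
                SampleRateHertz = 16000,
                LanguageCode = "tr-TR",
            },
            InterimResults = false,
            SingleUtterance = true,
        }
    });
    // Step 2: flag set by the response loop when the utterance ends.
    bool recognitionEnded = false;
    Task printResponses = Task.Run(async () =>
    {
        while (await streamingCall.ResponseStream.MoveNext(
            default(CancellationToken)))
        {
            // Step 3: catch the END_OF_SINGLE_UTTERANCE event.
            if (streamingCall.ResponseStream.Current.SpeechEventType ==
                StreamingRecognizeResponse.Types.SpeechEventType.EndOfSingleUtterance)
            {
                recognitionEnded = true;
            }
            foreach (var result in streamingCall.ResponseStream.Current.Results)
                foreach (var alternative in result.Alternatives)
                    Console.WriteLine(alternative.Transcript);
        }
    });
    // Stream microphone audio to the API, exactly as in the question.
    object writeLock = new object();
    bool writeMore = true;
    var waveIn = new NAudio.Wave.WaveInEvent();
    waveIn.DeviceNumber = 0;
    waveIn.WaveFormat = new NAudio.Wave.WaveFormat(16000, 1);
    waveIn.DataAvailable += (sender, args) =>
    {
        lock (writeLock)
        {
            if (!writeMore) return;
            streamingCall.WriteAsync(new StreamingRecognizeRequest()
            {
                AudioContent = Google.Protobuf.ByteString
                    .CopyFrom(args.Buffer, 0, args.BytesRecorded)
            }).Wait();
        }
    };
    waveIn.StartRecording();
    // Step 4: wait for the event instead of a fixed number of seconds.
    while (!recognitionEnded) await Task.Delay(100);
    waveIn.StopRecording();
    lock (writeLock) writeMore = false;
    await streamingCall.WriteCompleteAsync();
    await printResponses;
}

Note that awaiting printResponses after WriteCompleteAsync matters: the final transcript can arrive after the end-of-utterance event, so the response loop must be allowed to drain before the method returns.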