Google Speech API returns an empty result for some FLAC files but not for others (C#)

Asked: 2015-03-23 18:56:41

Tags: c# json speech-recognition flac

An EMPTY result looks like this:

json[0] "{\"result\":[]}"
json[1] ""

A NON-EMPTY result (the desired result) looks like this:

json[0] "{\"result\":[]}"
json[1] "{\"result\":[{\"alternative\":[{\"transcript\":\"good morning Google how are you feeling today\",\"confidence\":0.987629}],\"final\":true}],\"result_index\":0}"
json[2] ""

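I have a function that is supposed to take a ".flac" file and convert it to text. For some reason, only two sample ".flac" files return a string when passed through the Google Speech API; my other FLAC files return an EMPTY result. These people have the same problem: link

Here are all my FLAC files: link

my.flac and this_is_a_test.flac work perfectly: the Google Speech API gives me a JSON object containing the text.

However, recorded.flac does not work with the Google Speech API and gives me an EMPTY JSON object.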

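Debugging:

  1. I thought the microphone was the problem, so I re-recorded recorded.flac many times, speaking clearly, and converted it to FLAC with ffmpeg. The Google Speech API still does not recognize recorded.flac.
  2. I thought my Content-Type was wrong, so I tried

    _HWR_SpeechToText.ContentType = "audio/l16; rate=16000";

    instead of

    _HWR_SpeechToText.ContentType = "audio/x-flac; rate=44100";

    Then none of them worked, not a single FLAC file, so I changed it back.

Here is my Google Speech API code that converts a FLAC file to text (I don't think it is necessary, but here it is anyway):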

        public void convert_to_text()
        {
            FileStream fileStream = File.OpenRead("recorded.flac");//my.flac
            MemoryStream memoryStream = new MemoryStream();
            memoryStream.SetLength(fileStream.Length);
            fileStream.Read(memoryStream.GetBuffer(), 0, (int)fileStream.Length);
            byte[] BA_AudioFile = memoryStream.GetBuffer();
            HttpWebRequest _HWR_SpeechToText = null;
            _HWR_SpeechToText = (HttpWebRequest)HttpWebRequest.Create("https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=" + ACCESS_GOOGLE_SPEECH_KEY);
            _HWR_SpeechToText.Credentials = CredentialCache.DefaultCredentials;
            _HWR_SpeechToText.Method = "POST";
            _HWR_SpeechToText.ContentType = "audio/x-flac; rate=44100";
            _HWR_SpeechToText.ContentLength = BA_AudioFile.Length;
            Stream stream = _HWR_SpeechToText.GetRequestStream();
            stream.Write(BA_AudioFile, 0, BA_AudioFile.Length);
            stream.Close();
            HttpWebResponse HWR_Response = (HttpWebResponse)_HWR_SpeechToText.GetResponse();
    
            StreamReader SR_Response = new StreamReader(HWR_Response.GetResponseStream());
            string responseFromServer = (SR_Response.ReadToEnd());
    
            String[] jsons = responseFromServer.Split('\n');
            foreach (String j in jsons)
            {
                dynamic jsonObject = JsonConvert.DeserializeObject(j);
                if (jsonObject == null || jsonObject.result.Count <= 0)
                {
                    continue;
                }
                text = jsonObject.result[0].alternative[0].transcript;
                jsons = null;
            }
            label1.Content = text;
        }
    

1 answer:

Answer 0 (score: 1)

First, check that the file is 16-bit PCM mono rather than stereo. This is easy to do in Audacity (http://www.audacityteam.org/).
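If you would rather check the file from code than open it in Audacity, the channel count, bit depth and sample rate can be read from the STREAMINFO block at the start of the FLAC stream. The following is only a minimal sketch: `FlacInfo`, `Read` and `ReadFile` are hypothetical names made up for illustration, and it assumes a raw FLAC file that starts with the `fLaC` marker (no ID3 tag in front).

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of any Google or FLAC library): decodes the
// STREAMINFO metadata block found at the start of a raw FLAC stream.
public static class FlacInfo
{
    public static (int SampleRate, int Channels, int BitsPerSample) Read(byte[] header)
    {
        // Layout: 4-byte "fLaC" marker, 4-byte block header, 34-byte STREAMINFO.
        if (header.Length < 42 ||
            header[0] != 'f' || header[1] != 'L' || header[2] != 'a' || header[3] != 'C')
            throw new InvalidDataException("Not a raw FLAC stream (missing fLaC marker).");
        // Low 7 bits of the block header give the block type; 0 means STREAMINFO.
        if ((header[4] & 0x7F) != 0)
            throw new InvalidDataException("First metadata block is not STREAMINFO.");

        // Sample rate: 20 bits starting at byte 18.
        int sampleRate = (header[18] << 12) | (header[19] << 4) | (header[20] >> 4);
        // Channel count minus one: the next 3 bits.
        int channels = ((header[20] >> 1) & 0x07) + 1;
        // Bits per sample minus one: the next 5 bits.
        int bitsPerSample = (((header[20] & 0x01) << 4) | (header[21] >> 4)) + 1;
        return (sampleRate, channels, bitsPerSample);
    }

    public static (int SampleRate, int Channels, int BitsPerSample) ReadFile(string path)
    {
        byte[] header = new byte[42];
        using (FileStream fs = File.OpenRead(path))
        {
            // Stream.Read may return fewer bytes than requested, so loop.
            int read = 0;
            while (read < header.Length)
            {
                int n = fs.Read(header, read, header.Length - read);
                if (n == 0) break;
                read += n;
            }
            if (read < header.Length)
                throw new InvalidDataException("File is too short to be a FLAC stream.");
        }
        return Read(header);
    }
}
```

Calling `FlacInfo.ReadFile("recorded.flac")` and comparing the result with the values for a file that works should show whether the recording is stereo. The sample rate it reports is also the value you would expect to put in the `rate=` part of the Content-Type header.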

Then you can use this simple code to send the file:

    // Requires: using System.Net;
    string api_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
    string path = @"C:\temp\good-morning-google.flac";

    // Read the whole FLAC file into memory.
    byte[] bytes = System.IO.File.ReadAllBytes(path);

    WebClient client = new WebClient();
    // The rate here should match the sample rate of the FLAC file.
    client.Headers.Add("Content-Type", "audio/x-flac; rate=44100");
    byte[] result = client.UploadData(string.Format(
        "https://www.google.com/speech-api/v2/recognize?client=chromium&lang=en-us&key={0}", api_key), "POST", bytes);

    // The response body contains one JSON object per line.
    string s = client.Encoding.GetString(result);
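The response string still has to be split into lines, because (as the output in the question shows) the first line is usually an empty `{"result":[]}` object and the transcript arrives on a later line. Below is a minimal sketch of that step using `System.Text.Json` instead of the Json.NET `dynamic` approach from the question; `SpeechResponse` and `FirstTranscript` are hypothetical names, and the JSON shape is assumed to be exactly the one shown in the question.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: extracts the first transcript from the line-delimited
// JSON response shown in the question.
public static class SpeechResponse
{
    // Returns the first transcript found, or null if every line is empty
    // or carries an empty "result" array.
    public static string FirstTranscript(string response)
    {
        foreach (string line in response.Split('\n'))
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            using (JsonDocument doc = JsonDocument.Parse(line))
            {
                JsonElement root = doc.RootElement;
                if (root.ValueKind != JsonValueKind.Object ||
                    !root.TryGetProperty("result", out JsonElement results) ||
                    results.ValueKind != JsonValueKind.Array ||
                    results.GetArrayLength() == 0)
                {
                    continue; // the leading {"result":[]} line lands here
                }

                JsonElement first = results[0];
                if (first.TryGetProperty("alternative", out JsonElement alts) &&
                    alts.ValueKind == JsonValueKind.Array &&
                    alts.GetArrayLength() > 0 &&
                    alts[0].TryGetProperty("transcript", out JsonElement transcript))
                {
                    return transcript.GetString();
                }
            }
        }
        return null;
    }
}
```

With the answer's snippet, `SpeechResponse.FirstTranscript(s)` would return the recognized text, or `null` when the API sends back only empty results, which is the case the question describes for recorded.flac.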