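My .wav file is only 4 seconds long. Even after multiple retries, and after running the request from the cloud, I still get the following error: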
* upload completely sent off: 12 out of 12 bytes
< HTTP/1.1 408 Request timed out (> 14000 ms)
< Transfer-Encoding: chunked
< Content-Type: text/plain
< Server: Microsoft-IIS/8.5
< X-MSEdge-Ref:
Has anyone run into this problem? This is my request:
`curl -v "https://speech.platform.bing.com/recognize?scenarios=catsearch&appid=D4D52672-91D7-4C74-8AD8-42B1D98141A5&locale=en-US&device.os=wp7&version=3.0&format=json&requestid=1d4b6030-9099-12e0-91e4-0800200c9a67&instanceid=1d4b6030-9099-12e0-91e5-0800200c9a68" -H "Authorization: Bearer $1" -H "Content-Type: audio/wav; samplerate=8000" --data-binary $2`
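Answer 0 (score: 2)
I also had some trouble getting this to work. The following Bash script, "bingrec.sh", may help make things clearer; enter your SUBSCRIPTION_KEY and adjust SAMPLERATE etc. as needed. As others have pointed out, locale and scenarios need to be set to supported values, and instanceid and requestid need to be in GUID format. The audio file should be shorter than 10 seconds, with a sample rate of 8000 or 16000. In addition, curl's "--data-binary" parameter needs an "@" in front of the audio file name.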
#!/bin/bash
# Usage: ./bingrec.sh /path/to/file
# Send audio file $1 through Bing speech recognition API.
#
SUBSCRIPTION_KEY="<your-key-here>"  # quoted: unquoted < and > would be parsed as redirections
LOCALE=en-US
SCENARIOS=ulm
SAMPLERATE=8000
CODEC=audio/pcm
TARGET_FILE=$1
if [ ! -f "$TARGET_FILE" ]; then
    echo "Error: file $TARGET_FILE does not exist!" >&2
    exit 1
fi
INSTANCE_ID=$(uuidgen) # random GUID for instance
REQUEST_ID=$(uuidgen) # random GUID for request
APPID=D4D52672-91D7-4C74-8AD8-42B1D98141A5 # APPID for Bing Speechrec API, don't change
DEVICE_OS=linux # arbitrary
FORMAT=json
# Request an access token using the subscription key.
AUTH_TOKEN=$(curl -v -X POST "https://api.cognitive.microsoft.com/sts/v1.0/issueToken" -H "Content-type: application/x-www-form-urlencoded" -H "Content-Length: 0" -H "Ocp-Apim-Subscription-Key: ${SUBSCRIPTION_KEY}")
# Send the audio; the "@" makes curl upload the file's contents rather than the file name.
curl -v -X POST "https://speech.platform.bing.com/recognize?scenarios=${SCENARIOS}&appid=${APPID}&locale=${LOCALE}&device.os=${DEVICE_OS}&version=3.0&format=${FORMAT}&instanceid=${INSTANCE_ID}&requestid=${REQUEST_ID}" -H "Authorization: Bearer ${AUTH_TOKEN}" -H "Content-type: audio/wav; codec='${CODEC}'; samplerate=${SAMPLERATE}" --data-binary @"${TARGET_FILE}"
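The requirements listed in this answer can be checked locally before any request is sent. The sketch below is only an illustration: the helper names (`is_guid`, `is_supported_rate`, `upload_size`) are made up here and are not part of curl or the Bing API. `upload_size` mimics how many bytes curl would upload for a given `--data-binary` argument. Without the leading "@", curl sends the argument text itself, which may well be why the question's log shows an upload of only 12 bytes.

```bash
#!/bin/bash
# Hypothetical helpers for illustration; the names are not from any real API.

# Succeeds if $1 looks like a GUID (8-4-4-4-12 hex digits).
is_guid() {
    [[ "$1" =~ ^[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}$ ]]
}

# Succeeds if $1 is one of the sample rates mentioned in this answer.
is_supported_rate() {
    [[ "$1" == "8000" || "$1" == "16000" ]]
}

# Prints the number of bytes curl would upload for a --data-binary argument:
# "@file" means the file's contents, anything else is sent as literal text.
upload_size() {
    local arg="$1"
    if [[ "$arg" == @* ]]; then
        wc -c < "${arg#@}" | tr -d '[:space:]'
    else
        printf '%s' "$arg" | wc -c | tr -d '[:space:]'
    fi
}
```

For example, `upload_size "$2"` and `upload_size "@$2"` can be compared before calling curl; if the first number matches the "upload completely sent off" figure in the verbose output, the file name was sent instead of the audio.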
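Answer 1 (score: 0)
I got this working. There were a couple of problems. One was the locale, which I changed to en-IN. The other was the scenario, which I set to scenarios=ulm. That seems to have done the trick; the speech is now recognized very clearly.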
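To make it easier to try different locale and scenario values, the request URL can be assembled from variables. This is a minimal sketch, assuming the same endpoint and query parameters used elsewhere in this thread; the function name `build_recognize_url` is hypothetical, and whether a given locale is accepted depends on the service.

```bash
#!/bin/bash
# Hypothetical helper: assembles the recognize URL from its parts.
# Arguments: locale, scenario, instance GUID, request GUID.
build_recognize_url() {
    local locale="$1" scenario="$2" instance_id="$3" request_id="$4"
    local appid="D4D52672-91D7-4C74-8AD8-42B1D98141A5"  # app ID used throughout this thread
    printf 'https://speech.platform.bing.com/recognize?scenarios=%s&appid=%s&locale=%s&device.os=linux&version=3.0&format=json&instanceid=%s&requestid=%s\n' \
        "$scenario" "$appid" "$locale" "$instance_id" "$request_id"
}

# Example (the combination reported to work in this answer):
#   build_recognize_url "en-IN" "ulm" "$(uuidgen)" "$(uuidgen)"
```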