Can someone help me find the error in my IBM Speech to Text code?

Time: 2017-06-12 21:59:22

Tags: python websocket speech-to-text ibm-watson

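I am sending requests to IBM's speech API over websockets, and I keep getting a broken pipe error. The documentation for the IBM Speech to Text API says it can accept frames of up to 4 MB, but I can only send about 70 KB before it breaks. https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.html#WSopen Also, if I send a file smaller than 70 KB (5 seconds), the connection does not break, but I get nothing back.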

    import websocket
    from requests import get
    import user_info
    import json
    import time
    import threading

    api_token = "https://stream.watsonplatform.net/authorization/api/v1/token"
    s2t_url = "wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize"
    s2t_model = 'es-ES_BroadbandModel'
    mb_chunk = 1024*50
    # https://pypi.python.org/pypi/websocket-clien*
    # https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.html


    # -------
    # on_open
    # -------
    def on_open(ws):
        """
        Called by the websocket after it is
        opened; sends metadata about the sound file
        """
        print("--------------WebSocket is open--------------")
        message = {
            'action': 'start',
            'content-type': 'audio/wav'
        }
        #def send_binary(*args):
        ws.send(json.dumps(message))
        i = 0
        with open("Deepak2_hwv4122_uncompressed.wav", "rb") as wav:
            # while True:
            piece = wav.read(mb_chunk)
            ws.send(piece)
            print(i)
            i+=1
            if not piece:
                #break
                pass
            wav.close()
            # ws.close()
        #t = threading.Thread(target=send_binary)
        #t.start()


    # ----------
    # on_message
    # ----------
    def on_message(ws, message):
        print("------------------MESSAGE------------------")
        print(message)


    # --------
    # on_error
    # --------
    def on_error(ws, error):
        print(error)
        print("------------------ERROR------------------")


    # --------
    # on_close
    # --------
    def on_close(ws):
        print("------------Connection is Closed-----------")
        ws.close()


    # ----------------
    # get_token
    # ----------------
    def get_token():
        """
        REST request to get the watson voice service API token
        """
        url = api_token + "?url=" + user_info.AUTH['url']
        print("URL: " + url)
        res = get(url, auth=(user_info.AUTH['username'], user_info.AUTH['password']))
        print('Auth Token: ' + res.text)
        return res.text


    # ----
    # main
    # ----
    if __name__ == "__main__":
        global ws_url
        cur_token = get_token()
        ws_url = s2t_url + '?watson-token=' + cur_token + '&model=' + s2t_model
        print("ws_uri: " + ws_url)

        # Start WebSocket Connection
        websocket.enableTrace(True)
        ws = websocket.WebSocketApp(ws_url, on_message=on_message, on_error=on_error, on_close=on_close)
        ws.on_open = on_open
        ws.run_forever()

The error I get is [Errno 32] Broken pipe:

    File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_app.py", line 268, in _callback
      callback(self, *args)
    File "watson-test.py", line 35, in on_open
      ws.send(piece)
    File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_app.py", line 117, in send
      if not self.sock or self.sock.send(data, opcode) == 0:
    File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_core.py", line 234, in send
      return self.send_frame(frame)
    File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_core.py", line 259, in send_frame
      l = self._send(data)
    File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_core.py", line 423, in _send
      return send(self.sock, data)
    File "/home/dell/rahmi/env/lib/python3.5/site-packages/websocket/_socket.py", line 116, in send
      return sock.send(data)
    File "/usr/lib/python3.5/ssl.py", line 861, in send
      return self._sslobj.write(data)
    File "/usr/lib/python3.5/ssl.py", line 586, in write
      return self._sslobj.write(data)

2 Answers:

Answer 0 (score: 3)

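I took a quick look at your code and found a missing piece: after pushing all the audio in the on_open method, you never signal the end of the audio stream. You can signal the end of the audio by sending an empty binary message or a text message containing the string {'action': 'stop'}, as described here: https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.html I believe this is why you are not getting any results. Also, make sure you do not close the websocket before the server has replied with the final results.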
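The sending side described above can be sketched roughly as follows. This is a minimal sketch, not the asker's final code: `stream_audio` is a hypothetical helper name, and `ws` is assumed to be any object whose `send(data, opcode=...)` method behaves like the one on websocket-client's `WebSocketApp`. The opcode values are the standard text and binary frame opcodes, defined locally so the sketch has no third-party dependency.

```python
import json

# WebSocket frame opcodes from RFC 6455. websocket-client exposes the same
# values as websocket.ABNF.OPCODE_TEXT and websocket.ABNF.OPCODE_BINARY.
OPCODE_TEXT = 0x1
OPCODE_BINARY = 0x2


def stream_audio(ws, audio, chunk_size=50 * 1024, content_type="audio/wav"):
    """Send the start message, the audio in binary chunks, then the stop message.

    `ws` is assumed to offer send(data, opcode=...); `audio` is any binary
    file-like object. Returns the number of audio chunks sent.
    """
    # Text frame announcing the audio format.
    ws.send(json.dumps({"action": "start", "content-type": content_type}),
            opcode=OPCODE_TEXT)

    chunks = 0
    while True:
        piece = audio.read(chunk_size)
        if not piece:
            break  # end of file: nothing left to send
        # Raw audio bytes go out as binary frames.
        ws.send(piece, opcode=OPCODE_BINARY)
        chunks += 1

    # Text frame telling the service that the audio stream has ended.
    ws.send(json.dumps({"action": "stop"}), opcode=OPCODE_TEXT)
    return chunks
```

Inside on_open this would be called as `stream_audio(ws, wav)` within the `with open(...)` block, leaving the socket open afterwards so the service can still deliver its results.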

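Thank you for your answer, Sayuri Mizuguchi. I actually wrote the code hosted at https://github.com/watson-developer-cloud/speech-to-text-websockets-python, which is a simple example of interacting with Watson STT over websockets. That project is being integrated into the Watson Python SDK here: https://github.com/watson-developer-cloud/python-sdk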

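As for converting to base64, you only need to make sure the audio is sent as a binary message; websocket stacks can generally send either text messages or binary messages.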
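On the receiving side, the advice about not closing the websocket before the final results arrive could look something like the sketch below. It rests on assumptions about the message format that should be checked against the service documentation: that results arrive as JSON with a "results" list whose entries carry a "final" flag and an "alternatives" list with a "transcript" field, that errors arrive under an "error" key, and that the service sends {"state": "listening"} once after the start message and once more after it has finished with the audio. `TranscriptCollector` is a hypothetical name.

```python
import json


class TranscriptCollector:
    """Collects final transcripts and reports when it is safe to close."""

    def __init__(self):
        self.transcripts = []
        self.listening_count = 0
        self.error = None

    def handle(self, message):
        """Process one text message; return True once the socket may be closed."""
        data = json.loads(message)

        if "error" in data:
            # Assumed error shape: {"error": "..."}; stop waiting.
            self.error = data["error"]
            return True

        if data.get("state") == "listening":
            # Assumption: the first "listening" acknowledges the start message,
            # the second one follows the final results after the stop message.
            self.listening_count += 1
            return self.listening_count >= 2

        for result in data.get("results", []):
            if result.get("final"):
                # Keep only final hypotheses, skipping interim ones.
                self.transcripts.append(result["alternatives"][0]["transcript"])
        return False
```

In on_message this would be used as `if collector.handle(message): ws.close()`, so the connection is closed only after the service has finished replying.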

Answer 1 (score: 1)

I ran into the same problem when using websockets. I really recommend this project, since it does what you are trying to do. Check here

Did you convert the audio file to base64 before sending it?


Try following the steps in this project from IBM Developer's

Check:

    ws.send(sound, { binary: true, mask: true});