How to incorporate SSML into Python

Time: 2016-04-14 20:44:57

Tags: python alexa alexa-skills-kit ssml

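I need to use SSML to play an audio file with the <audio> tag in my Alexa skill (as per Amazon's instructions).

The problem is that I don't know how to use SSML from Python. I know I can use it from Java, but I want to build my skill in Python. I have searched around and have not found any working example of SSML in a Python script or program. Does anyone know of one?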

6 answers:

Answer 0: (score: 3)

The SSML audio goes in the response.outputSpeech.ssml property. Here is an example object with the other required parameters removed:

{
  "response": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak>Welcome to Car-Fu. <audio src=\"https://carfu.com/audio/carfu-welcome.mp3\" /> You can order a ride, or request a fare estimate. Which will it be?</speak>"
    }
  }
}
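
In Python, the same structure can be built as a plain dictionary and serialized with json.dumps, which takes care of escaping the quotes inside the SSML string. The following is only a minimal sketch, not Amazon's reference code: the helper name build_ssml_response is hypothetical, and the "version" and "shouldEndSession" fields are included on the assumption that the skill returns the usual custom-skill response envelope.

```python
import json


def build_ssml_response(ssml, should_end_session=False):
    """Return an Alexa-style response dict whose speech is SSML (hypothetical helper)."""
    return {
        "version": "1.0",  # assumed envelope field, not shown in the answer
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": should_end_session,  # assumed field
        },
    }


# Single quotes around the middle piece let the attribute keep its double quotes.
ssml = (
    "<speak>Welcome to Car-Fu. "
    '<audio src="https://carfu.com/audio/carfu-welcome.mp3" /> '
    "You can order a ride, or request a fare estimate. Which will it be?"
    "</speak>"
)
response = build_ssml_response(ssml)
print(json.dumps(response, indent=2))
```

A handler (for example an AWS Lambda entry point) would return this dictionary as the skill's response.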

Further reference:

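Answer 1: (score: 3)

This was asked two years ago, but maybe someone will benefit from the following.

I just checked, and if you use the Alexa Skills Kit SDK for Python, you can simply add SSML to your response, for example: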

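from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.utils import is_request_type

sb = SkillBuilder()

@sb.request_handler(can_handle_func=is_request_type("LaunchRequest"))
def launch_request_handler(handler_input):
    # Escape the inner double quotes so the string literal stays valid Python.
    speech_text = "Wait for it 3 seconds<break time=\"3s\"/> Buuuu!"
    return handler_input.response_builder.speak(speech_text).response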

Hope this helps.

Answer 2: (score: 2)

These comments really helped me figure out how to get SSML working with ask-sdk-python. Instead of

speech_text = "Wait for it 3 seconds<break time=\"3s\"/> Buuuu!" - from wmatt's comment

I defined variables that hold the opening and closing of each tag I am using:

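# Tag values below are assumed; the answer only showed ssml_start.
ssml_start, ssml_end = '<speak>', '</speak>'
whispered_s, whispered_e = '<amazon:effect name="whispered">', '</amazon:effect>'
speech_text = ssml_start + whispered_s + "Here are the latest alerts from MMDA" + whispered_e + ssml_end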

Using single quotes and concatenating these strings into the speech output did the trick. Thanks a lot, everyone!
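
As a variation on keeping one start and one end variable per tag, a small helper can generate both halves and quote the attributes. This is only a sketch using the standard library; the tag function is hypothetical and is not part of ask-sdk-python.

```python
from xml.sax.saxutils import escape, quoteattr


def tag(element, content="", **attrs):
    """Wrap already-valid SSML content in an element (hypothetical helper).

    With no content, a self-closing tag such as <break time="1s"/> is produced.
    """
    attr_text = "".join(f" {key}={quoteattr(value)}" for key, value in attrs.items())
    if content == "":
        return f"<{element}{attr_text}/>"
    return f"<{element}{attr_text}>{content}</{element}>"


# escape() protects plain text; tags built by tag() are passed through unchanged.
whispered = tag(
    "amazon:effect",
    escape("Here are the latest alerts from MMDA"),
    name="whispered",
)
speech_text = tag("speak", whispered + tag("break", time="1s"))
print(speech_text)
```

The first parameter is called element rather than name so that name="whispered" can still be passed as an SSML attribute.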

Answer 3: (score: 1)

Install ssml-builder with "pip install ssml-builder" and use it like this:

from ssml_builder.core import Speech

speech = Speech()
speech.add_text('sample text')
ssml = speech.speak()
print(ssml)

Answer 4: (score: 0)

The question is a bit vague, but I did manage to figure out how to incorporate SSML into a Python script. Here is a snippet that plays audio:

  if 'Item' in intent['slots']:
    chosen_item = intent['slots']['Item']['value']
    session_attributes = create_attributes(chosen_item)

    # The spaces around chosen_item keep it from being glued to the neighboring text.
    speech_output = ('<speak> Here is something to play ' +
                     chosen_item +
                     ' <audio src="https://s3.amazonaws.com/example/example.mp3" /> </speak>')
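
A slot value is free text, so characters such as & or < in it would make the resulting SSML invalid. The sketch below shows one way to guard against that with the standard library; build_item_speech is a hypothetical helper, and the intent dictionary is assumed to have the same shape as in the snippet above.

```python
from xml.sax.saxutils import escape

# Placeholder URL taken from the answer above.
AUDIO_URL = "https://s3.amazonaws.com/example/example.mp3"


def build_item_speech(intent):
    """Return SSML for the chosen item, or None if the slot has no value."""
    slot = intent.get("slots", {}).get("Item", {})
    chosen_item = slot.get("value")
    if not chosen_item:
        return None
    # Escape &, < and > so the spoken value cannot break the SSML markup.
    return (
        "<speak>Here is something to play "
        + escape(chosen_item)
        + f' <audio src="{AUDIO_URL}" /></speak>'
    )


intent = {"slots": {"Item": {"name": "Item", "value": "rock & roll"}}}
print(build_item_speech(intent))
```

The returned string still has to be sent with the outputSpeech type set to "SSML"; otherwise the tags are treated as plain text.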

Answer 5: (score: 0)

There is an SSML package for Python.

You can install it via pip as follows:

    $ pip install pyssml
    or
    $ pip3 install pyssml

An example is at the link below:

http://blog.naver.com/chandong83/221145083125 (sorry, it is in Korean).



    # -*- coding: utf-8 -*-
    # for amazon
    import re
    import os
    import sys
    import time
    from boto3 import client
    from botocore.exceptions import BotoCoreError, ClientError
    import vlc
    from pyssml.PySSML import PySSML
    from pyssml.AmazonSpeech import AmazonSpeech


    # Amazon Polly service function
    # if isSSML is True, SSML format
    # else Text format
    def aws_polly(text, isSSML = False):
        voiceid = 'Joanna'

        try:
            polly = client("polly", region_name="ap-northeast-2")

            if isSSML:
                textType = 'ssml'
            else:
                textType = 'text'

            response = polly.synthesize_speech(
                    TextType=textType,
                    Text=text,
                    OutputFormat="mp3",
                    VoiceId=voiceid)

            # get Audio Stream (mp3 format)
            stream = response.get("AudioStream")

            # save the audio Stream File
            with open('aws_test_tts.mp3', 'wb') as f:
                data = stream.read()
                f.write(data)


            # VLC play audio
            # non block
            p = vlc.MediaPlayer('./aws_test_tts.mp3')
            p.play()

        except ( BotoCoreError, ClientError) as err:
            print(str(err))


    if __name__ == '__main__':
        # normal pyssml
        #s = PySSML()

        # amazon speech ssml
        s = AmazonSpeech()

        # normal 
        s.say('i am normal')

        #  speed is very slow
        s.prosody({'rate':"x-slow"}, 'i am very slow')

        #  volume is very loud
        s.prosody({'volume':'x-loud'}, 'my voice is very loud')

        #  take a one sec
        s.pause('1s')

        #  pitch is very high
        s.prosody({'pitch':'x-high'}, 'my tone is very high')

        # Amazon-specific: whisper effect
        s.whisper('i am whispering')
        # print to convert to ssml format
        print(s.ssml())

        # request aws polly and play
        aws_polly(s.ssml(), True)

        # Wait while playback.
        time.sleep(50)