How to make multiple requests to the ibm-watson text to speech API?

Asked: 2019-07-19 05:45:08

Tags: node.js ibm-watson

Desired Behaviour

Send multiple requests to ibm-watson, wait for them all to finish, then concatenate the audio and send it back to the client via res.download().

I have successfully achieved sending a file back to the client with a single request, but can't figure out how to incorporate Promise.all(promises) in order to send and receive multiple requests and responses.

What I've Tried

The synthesize() method of ibm-watson has a 5KB text limit.

https://cloud.ibm.com/apidocs/text-to-speech?code=node#synthesize-audio

If my input text is more than 5KB, I chunk it into smaller strings stored in an array:

var text_chunk_array = ["string 1", "string 2", "string 3"];

I need to send each of those strings to ibm-watson and wait until they have all completed.
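For illustration only, a minimal sketch of that chunking step could look like the following (splitIntoChunks is a hypothetical helper, not part of the original post, and I'm assuming the 5KB limit counts bytes, hence Buffer.byteLength()):

// Hypothetical helper: split long text into sentence-sized pieces that
// each stay safely under the 5KB limit of synthesize().
function splitIntoChunks(text, maxBytes = 4000) {
    const sentences = text.match(/[^.!?]+[.!?]*\s*/g) || [text];
    const chunks = [];
    let current = "";
    for (const sentence of sentences) {
        // Start a new chunk if appending this sentence would exceed the limit.
        if (current && Buffer.byteLength(current + sentence, "utf8") > maxBytes) {
            chunks.push(current.trim());
            current = "";
        }
        current += sentence;
    }
    if (current.trim()) chunks.push(current.trim());
    return chunks;
}

var text_chunk_array = splitIntoChunks(input_text); // input_text: the long source text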

My approach is to use some variation of Promise.all().

Here is a set of values to test with (717 characters in total):

var text_chunk_array = [
    `<speak version='1.0'><prosody rate='-15%'>Here is a rather long piece of text that we could speak in order to recreate the problem.</prosody></speak>`,
    `<speak version='1.0'><prosody rate='-15%'>The text, it seems, may have some glitches in the sound quality.</prosody></speak>`,
    `<speak version='1.0'><prosody rate='-15%'>I am wondering what the solution is to these glitches in sound.</prosody></speak>`,
    `<speak version='1.0'><prosody rate='-15%'>And I am also wondering if this array will be enough to recreate the problem</prosody></speak>`,
    `<speak version='1.0'><prosody rate='-15%'>And here is another fairly long, actually very very very long, sentence to ensure that the problem can, in fact, be recreated.</prosody></speak>`
];

Attempt 01

Mapping over text_chunk_array and calling the synthesize() method for each string (inspired by this).

I wrapped each call in return new Promise() because... I wasn't sure what else to do, although the documentation states that a promise is returned by default, so I'm not sure whether this is required:

"All SDK methods are asynchronous, as they are making network requests to Watson services. To handle receiving the data from these requests, the SDK offers support for both Promises and Callback functions. A Promise will be returned by default unless a callback function is provided."

Source: https://github.com/watson-developer-cloud/node-sdk#callbacks-vs-promises
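Based on that quote, the two calling styles should look roughly like this (a sketch only, reusing the textToSpeech and synthesizeParams definitions from the attempt below):

// Promise style: no callback supplied, so synthesize() returns a promise.
textToSpeech.synthesize(synthesizeParams)
    .then(audio => { /* use the audio stream */ })
    .catch(err => console.log(err));

// Callback style: supplying a callback means no promise is returned.
textToSpeech.synthesize(synthesizeParams, (err, audio) => {
    if (err) {
        console.log(err);
        return;
    }
    // use the audio stream
});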


var promises = text_chunk_array.map(text_chunk => {

    return new Promise((resolve, reject) => {

        var textToSpeech = new TextToSpeechV1({
            iam_apikey: local_settings.TEXT_TO_SPEECH_IAM_APIKEY,
            url: local_settings.TEXT_TO_SPEECH_URL
        });

        var synthesizeParams = {
            text: text_chunk,
            accept: 'audio/mp3',
            voice: 'en-US_AllisonV3Voice'
        };

        textToSpeech.synthesize(synthesizeParams, (err, audio) => {
            if (err) {
                console.log("oopsie, an error occurred: " + err); 
                return reject(err);
            }
            resolve(audio);
        });

    });
});

// wait till all promises have finished  
try {
    var audio_files = await Promise.all(promises);

    if (audio_files.length === 1) {
        console.log("there is only 1 audio file in the array");
        var audio = audio_files[0];
    } else {
        // to do:  concatenate the audio objects, somehow...
        console.log("there is more than 1 audio file in the array");
    }

    // BEGIN write audio to file, send it to client, then delete it from server
    var absPath = path.join(__dirname, "/my_files/", file_name);
    var relPath = path.join("./my_files", file_name); // path relative to server root

    var write_stream = fs.createWriteStream(relPath);
    audio.pipe(write_stream);

    write_stream.on('finish', function() {

        // download the file (using absPath)
        res.download(absPath, (err) => {
            if (err) {
                console.log(err);
            }
            // delete the file (using relPath)
            fs.unlink(relPath, (err) => {
                if (err) {
                    console.log(err);
                }
                console.log("FILE [" + file_name + "] REMOVED!");
            });
        });

    });
    // END write audio to file, send it to client, then delete it from server
} catch (err) {
    console.log("hola, there was an error: " + err);
}

Error

hola, there was an error: TypeError: Cannot read property 'pipe' of undefined

Attempt 02

The same as the first, but without wrapping things in return new Promise():

var promises = text_chunk_array.map(text_chunk => {

    var textToSpeech = new TextToSpeechV1({
        iam_apikey: local_settings.TEXT_TO_SPEECH_IAM_APIKEY,
        url: local_settings.TEXT_TO_SPEECH_URL
    });

    var synthesizeParams = {
        text: text_chunk,
        accept: 'audio/mp3',
        voice: 'en-US_AllisonV3Voice'
    };

    textToSpeech
        .synthesize(synthesizeParams)
        .then(audio => {
            return audio;
        })
        .catch(err => {
            console.log("oopsie, an error occurred: " + err);
        });

});

// followed by the same code after 'wait till all promises have finished' in attempt 01

Error

hola, there was an error: TypeError: Cannot read property 'pipe' of undefined

Attempt 03

A combination of Attempt 01 and Terry Lennox's CombinedStream solution.

This works best so far - there are no errors, but some audio players truncate the end of the resulting audio.

1 Answer:

Answer 0 (score: 1)

I got this working by concatenating the output streams returned by the client library.

// Requires are assumed here; the import path below is for the ibm-watson SDK
// and may differ depending on the SDK version in use.
const fs = require('fs');
const TextToSpeechV1 = require('ibm-watson/text-to-speech/v1');

const textToSpeech = new TextToSpeechV1({
    iam_apikey: local_settings.TEXT_TO_SPEECH_IAM_APIKEY,
    url: local_settings.TEXT_TO_SPEECH_URL
});

function synthesizeText(text) {
    const synthesizeParams = {
        text: text,
        accept: 'audio/mp3',
        voice: 'en-GB_KateVoice'
    };
    return textToSpeech.synthesize(synthesizeParams);
}

async function synthesizeTextChunks(text_chunks, outputFile) {
    const audioArray = [];
    let blockNumber = 0;
    // I'm using for .. of to ensure we keep the sequence intact.. we could use async library to do something similar..
    for(let text of text_chunks) {
        console.log(`Synthesizing block #${++blockNumber}...`);
        audioArray.push(await synthesizeText(text));
    }
    console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to file ${outputFile}...`);
    const outputStream = fs.createWriteStream(outputFile);
    audioArray.forEach((audio, index) => {
        // Pipe the audio stream to the output file stream. The output stream will be closed when the last file has been piped..
        audio.pipe(outputStream, {end: (index === (audioArray.length-1))});
    });
}

const textChunks = ["She's comin' on", "boys and she's coming", "on strong"];
synthesizeTextChunks(textChunks, "output.mp3");

Here is an alternative form of the synthesizeTextChunks code, this time using the combined-stream library and Promise.all:

const CombinedStream = require('combined-stream');

async function synthesizeTextChunks(text_chunks, outputFile) {
    const audioArray = await Promise.all(text_chunks.map(synthesizeText));
    console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to file ${outputFile}...`);
    const combinedStream = CombinedStream.create();
    audioArray.forEach((audio) => {
        combinedStream.append(audio);
    });
    combinedStream.pipe(fs.createWriteStream(outputFile));
}
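This variant takes the same arguments as the sequential version, so the earlier call works unchanged:

synthesizeTextChunks(textChunks, "output.mp3");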

To see whether any audio glitches are caused by the individual synthesis calls or by the recombination of the streams, we can stream each chunk to a separate file like so:

async function synthesizeTextChunksSeparateFiles(text_chunks) {
    const audioArray = await Promise.all(text_chunks.map(synthesizeText));
    console.log(`synthesizeTextChunks: Received ${audioArray.length} result(s), writing to separate files...`);
    audioArray.forEach((audio, index) => {
        audio.pipe(fs.createWriteStream(`audio-${index}.mp3`));
    });
}

To merge the .mp3 files, perhaps you could use the node fluent-ffmpeg library - this does require ffmpeg to be installed - I get much better audio quality when the audio files are merged using this tool:

function combineMp3Files(files, outputFile) {
    const ffmpeg = require("fluent-ffmpeg");
    const combiner = ffmpeg().on("error", err => {
        console.error("An error occurred: " + err.message);
    })
    .on("end", () => {
        console.log('Merge complete');
    });

    // Add in each .mp3 file.
    files.forEach(file => {
        combiner.input(file)
    });

    combiner.mergeToFile(outputFile); 
}
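Putting the last two pieces together might then look like the sketch below (my own glue code, not part of the original answer - note that it waits for every chunk file to finish writing before merging, since synthesizeTextChunksSeparateFiles() above does not report completion):

// Hypothetical glue code: synthesize each chunk to its own file, wait for all
// writes to finish, then merge the pieces with combineMp3Files() from above.
async function synthesizeAndMerge(text_chunks, outputFile) {
    const audioArray = await Promise.all(text_chunks.map(synthesizeText));
    const files = audioArray.map((_, index) => `audio-${index}.mp3`);
    await Promise.all(audioArray.map((audio, index) => {
        return new Promise((resolve, reject) => {
            audio.pipe(fs.createWriteStream(files[index]))
                .on('finish', resolve)
                .on('error', reject);
        });
    }));
    combineMp3Files(files, outputFile);
}

synthesizeAndMerge(textChunks, "merged-output.mp3");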