Question

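This question is essentially about the suitability of Microsoft's Speech API (SAPI) for server workloads, and whether it can be used reliably inside w3wp for speech synthesis. We have an asynchronous controller that uses the native System.Speech assembly in .NET 4 (not the Microsoft.Speech assembly that ships as part of Microsoft Speech Platform - Runtime Version 11) together with lame.exe to generate MP3s, as follows: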

    [CacheFilter]
    public void ListenAsync(string url)
    {
        string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());

        try
        {
            // Synthesis runs on a dedicated thread; Speak does not return otherwise.
            var t = new System.Threading.Thread(() =>
            {
                // The using block disposes the synthesizer, so no explicit Dispose call is needed.
                using (SpeechSynthesizer ss = new SpeechSynthesizer())
                {
                    ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050, AudioBitsPerSample.Eight, AudioChannel.Mono));
                    ss.Speak("Here is a test sentence...");
                    ss.SetOutputToNull();
                }

                // Convert the WAV file to MP3 with lame.exe.
                var process = new Process() { EnableRaisingEvents = true };
                process.StartInfo.FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe");
                process.StartInfo.Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3"));
                process.StartInfo.UseShellExecute = false;
                process.StartInfo.RedirectStandardOutput = false;
                process.StartInfo.RedirectStandardError = false;
                process.Exited += (sender, e) =>
                {
                    // Remove the intermediate WAV and signal completion of the async action.
                    System.IO.File.Delete(fileName);

                    AsyncManager.OutstandingOperations.Decrement();
                };

                AsyncManager.OutstandingOperations.Increment();
                process.Start();
            });

            t.Start();
            t.Join();
        }
        catch { } // note: swallows every exception

        AsyncManager.Parameters["fileName"] = fileName;
    }

    public FileResult ListenCompleted(string fileName)
    {
        return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
    }

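The questions are: why does SpeechSynthesizer need to run on a dedicated thread in order to return (this has been reported elsewhere on SO), and would implementing a STAThreadRouteHandler for this request be more efficient or scalable than the approach above?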

Second, what are the options for running SpeakAsync in an ASP.NET (MVC or WebForms) context? None of the options I have tried seem to work (see the update below).

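Any other suggestions for improving this pattern (that is, two dependencies that must execute serially relative to each other, but each of which has async support) are welcome. I don't feel this scheme is sustainable under load, especially given the known memory leaks in SpeechSynthesizer. I am considering running this service on a different stack altogether.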

Update: Neither the Speak nor the SpeakAsync option appears to work under the STAThreadRouteHandler. The former produces:

    System.InvalidOperationException: Asynchronous operations are not allowed in this context. Page starting an asynchronous operation has to have the Async attribute set to true and an asynchronous operation can only be started on a page prior to PreRenderComplete event.
       at System.Web.LegacyAspNetSynchronizationContext.OperationStarted()
       at System.ComponentModel.AsyncOperationManager.CreateOperation(Object userSuppliedState)
       at System.Speech.Internal.Synthesis.VoiceSynthesis..ctor(WeakReference speechSynthesizer)
       at System.Speech.Synthesis.SpeechSynthesizer.get_VoiceSynthesizer()
       at System.Speech.Synthesis.SpeechSynthesizer.SetOutputToWaveFile(String path, SpeechAudioFormatInfo formatInfo)

The latter results in:

    System.InvalidOperationException: The asynchronous action method 'Listen' cannot be executed synchronously.
       at System.Web.Mvc.Async.AsyncActionDescriptor.Execute(ControllerContext controllerContext, IDictionary`2 parameters)

It seems a custom STA thread pool (with ThreadStatic instances of the COM object) is a better approach.
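For illustration, the custom STA thread idea could be sketched as one long-lived STA thread that drains a work queue. This is only a minimal sketch: `StaWorker` and its `Run` method are hypothetical names, not part of System.Speech or ASP.NET, and a real pool would likely need several such threads plus shutdown handling suited to the host.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: executes queued work items on a single dedicated STA thread.
public sealed class StaWorker : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public StaWorker()
    {
        _thread = new Thread(() =>
        {
            // Blocks until work arrives; ends once CompleteAdding is called and the queue is empty.
            foreach (var work in _queue.GetConsumingEnumerable())
            {
                work();
            }
        });
        _thread.SetApartmentState(ApartmentState.STA); // must be set before Start
        _thread.IsBackground = true;
        _thread.Start();
    }

    // Queues func for the STA thread and returns a Task that completes with its result.
    public Task<T> Run<T>(Func<T> func)
    {
        var tcs = new TaskCompletionSource<T>();
        _queue.Add(() =>
        {
            try { tcs.SetResult(func()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
        _queue.Dispose();
    }
}
```

Any per-thread COM object would then be created inside the delegate passed to `Run`, so it is only ever touched from the STA thread.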

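Update #2: It seems System.Speech.SpeechSynthesizer does not require STA treatment after all, and runs fine on MTA threads as long as you follow the thread Start/Join pattern. Here is a new version that uses SpeakAsync correctly (the problem was disposing of the synthesizer prematurely!) and splits WAV generation and MP3 generation into two separate requests: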

    [CacheFilter]
    [ActionName("listen-to-text")]
    public void ListenToTextAsync(string text)
    {
        AsyncManager.OutstandingOperations.Increment();

        var t = new Thread(() =>
        {
            SpeechSynthesizer ss = new SpeechSynthesizer();
            string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());

            ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050,
                                                                       AudioBitsPerSample.Eight,
                                                                       AudioChannel.Mono));
            ss.SpeakCompleted += (sender, e) =>
            {
                ss.SetOutputToNull();
                ss.Dispose();

                AsyncManager.Parameters["fileName"] = fileName;
                AsyncManager.OutstandingOperations.Decrement();
            };

            CustomPromptBuilder pb = new CustomPromptBuilder(settings.DefaultVoiceName);
            pb.AppendParagraphText(text);
            ss.SpeakAsync(pb);
        });

        t.Start();
        t.Join();
    }

    [CacheFilter]
    public ActionResult ListenToTextCompleted(string fileName)
    {
        return RedirectToAction("mp3", new { fileName = fileName });
    }

    [CacheFilter]
    [ActionName("mp3")]
    public void Mp3Async(string fileName)
    {
        var process = new Process()
        {
            EnableRaisingEvents = true,
            StartInfo = new ProcessStartInfo()
            {
                FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe"),
                Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3")),
                UseShellExecute = false,
                RedirectStandardOutput = false,
                RedirectStandardError = false
            }
        };

        process.Exited += (sender, e) =>
        {
            System.IO.File.Delete(fileName);
            AsyncManager.Parameters["fileName"] = fileName;
            AsyncManager.OutstandingOperations.Decrement();
        };

        AsyncManager.OutstandingOperations.Increment();
        process.Start();
    }

    [CacheFilter]
    public ActionResult Mp3Completed(string fileName)
    {
        return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
    }

Answer 1

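I/O is very expensive on a server. How many concurrent WAV write streams do you think a server hard drive can sustain? Why not do everything in memory and write the MP3 only once it is fully processed? MP3s are much smaller, so the I/O will be busy for far less time. If you want, you can even change the code to return the stream directly to the user instead of saving an MP3 file at all.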
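As a rough sketch of that idea, the WAV can be synthesized into a MemoryStream and piped through lame.exe over stdin/stdout, so no intermediate file touches the disk. The class and method names here are made up for illustration, `lamePath` is assumed to point at your own copy of lame.exe, `-V2` is taken from the question, and `--quiet` plus `-` for stdin and stdout are standard LAME command-line options. The threading concerns raised in the question about calling Speak inside ASP.NET still apply to `SynthesizeWav`.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Speech.Synthesis;
using System.Threading.Tasks;

// Hypothetical helper: text -> WAV bytes -> MP3 bytes, entirely in memory.
public static class InMemoryMp3
{
    // Reads WAV from stdin ("-") and writes MP3 to stdout ("-").
    public const string LameArguments = "--quiet -V2 - -";

    public static byte[] SynthesizeWav(string text)
    {
        using (var synthesizer = new SpeechSynthesizer())
        using (var wav = new MemoryStream())
        {
            synthesizer.SetOutputToWaveStream(wav);
            synthesizer.Speak(text);
            synthesizer.SetOutputToNull(); // detach from the stream before copying it
            return wav.ToArray();
        }
    }

    public static byte[] EncodeWithLame(byte[] wav, string lamePath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = lamePath,
            Arguments = LameArguments,
            UseShellExecute = false,
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        using (var mp3 = new MemoryStream())
        {
            // Drain stdout on another thread so neither pipe fills up and blocks.
            Task reader = Task.Factory.StartNew(
                () => process.StandardOutput.BaseStream.CopyTo(mp3));

            process.StandardInput.BaseStream.Write(wav, 0, wav.Length);
            process.StandardInput.Close(); // signals end of input to lame

            reader.Wait();
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException("lame exited with code " + process.ExitCode);
            }
            return mp3.ToArray();
        }
    }
}
```

The resulting byte array can be returned from an action with `File(bytes, "audio/mpeg")`, or written to disk once if caching is wanted.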

See also: How can I use LAME to encode a wav to an mp3 in C#

Answer 2

This question is a bit old now, but this is what I'm doing, and it has worked well so far:

    public Task<FileStreamResult> Speak(string text)
    {
        return Task.Factory.StartNew(() =>
        {
            using (var synthesizer = new SpeechSynthesizer())
            {
                var ms = new MemoryStream();
                synthesizer.SetOutputToWaveStream(ms);
                synthesizer.Speak(text);

                ms.Position = 0;
                return new FileStreamResult(ms, "audio/wav");
            }
        });
    }

Might help someone...

Ultra-fast text-to-speech (WAV -> MP3) in ASP.NET MVC
