Hello, I've been looking for a way to play and record audio on Linux (preferably Ubuntu). I'm currently developing a front end for a voice recognition toolkit that automates several of the steps needed to adapt the PocketSphinx and Julius speech models. Suggestions for alternative approaches to audio input/output are welcome, as are fixes for the errors. Here is the code I'm currently using to play a .WAV file:
void Engine::sayText ( const string OutputText ) {
    string audioUri = "temp.wav";
    string requestUri = this->getRequestUri( OPENMARY_PROCESS , OutputText.c_str( ) );
    int error , audioStream;
    pa_simple *pulseConnection = NULL;   /* initialised so the cleanup at "finish" is safe */
    pa_sample_spec simpleSpecs;

    simpleSpecs.format = PA_SAMPLE_S16LE;
    simpleSpecs.rate = 44100;
    simpleSpecs.channels = 2;

    eprintf( E_MESSAGE , "Generating audio for '%s' from '%s'..." , OutputText.c_str( ) , requestUri.c_str( ) );
    FILE* audio = this->getHttpFile( requestUri , audioUri );
    fclose( audio );
    eprintf( E_MESSAGE , "Generated audio." );

    if ( ( audioStream = open( audioUri.c_str( ) , O_RDONLY ) ) < 0 ) {
        fprintf( stderr , __FILE__": open() failed: %s\n" , strerror( errno ) );
        goto finish;
    }

    if ( dup2( audioStream , STDIN_FILENO ) < 0 ) {
        fprintf( stderr , __FILE__": dup2() failed: %s\n" , strerror( errno ) );
        goto finish;
    }

    close( audioStream );

    pulseConnection = pa_simple_new( NULL , "AudioPush" , PA_STREAM_PLAYBACK , NULL , "openMary C++" , &simpleSpecs , NULL , NULL , &error );

    /* Stream the file to PulseAudio in 1 KiB chunks. Note that this pushes the
       raw file contents, WAV header included, straight to the playback stream. */
    for ( int i = 0 ; ; i++ ) {
        const int bufferSize = 1024;
        uint8_t audioBuffer[bufferSize];
        ssize_t r;

        eprintf( E_MESSAGE , "Buffering %d.." , i );

        /* Read some data ... */
        if ( ( r = read( STDIN_FILENO , audioBuffer , sizeof( audioBuffer ) ) ) <= 0 ) {
            if ( r == 0 ) /* EOF */
                break;
            eprintf( E_ERROR , __FILE__": read() failed: %s\n" , strerror( errno ) );
            goto finish;
        }

        /* ... and play it */
        if ( pa_simple_write( pulseConnection , audioBuffer , ( size_t ) r , &error ) < 0 ) {
            fprintf( stderr , __FILE__": pa_simple_write() failed: %s\n" , pa_strerror( error ) );
            goto finish;
        }

        usleep( 2 );
    }

    /* Make sure that every single sample was played */
    if ( pa_simple_drain( pulseConnection , &error ) < 0 )
        fprintf( stderr , __FILE__": pa_simple_drain() failed: %s\n" , pa_strerror( error ) );

finish:
    if ( pulseConnection )
        pa_simple_free( pulseConnection );
}
Note: if you'd like to see the rest of the code in this file, you can download it directly from Launchpad here.
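For the recording side I'm planning to use the same PulseAudio simple API, just in the PA_STREAM_RECORD direction with pa_simple_read(). This is only a rough, untested sketch (recordAudio and its parameters are made-up names, not part of the project above):

#include <pulse/simple.h>
#include <pulse/error.h>
#include <cerrno>
#include <cstdint>
#include <cstdio>

/* Capture totalBytes of raw PCM (S16LE, 44100 Hz, stereo) and append it to out. */
bool recordAudio ( FILE* out , size_t totalBytes ) {
    pa_sample_spec spec;
    spec.format = PA_SAMPLE_S16LE;
    spec.rate = 44100;
    spec.channels = 2;

    int error = 0;
    pa_simple* conn = pa_simple_new( NULL , "AudioPull" , PA_STREAM_RECORD , NULL , "record" , &spec , NULL , NULL , &error );
    if ( !conn ) {
        fprintf( stderr , "pa_simple_new() failed: %s\n" , pa_strerror( error ) );
        return false;
    }

    uint8_t buffer[1024];
    size_t captured = 0;
    while ( captured < totalBytes ) {
        /* pa_simple_read() blocks until the whole buffer has been filled. */
        if ( pa_simple_read( conn , buffer , sizeof( buffer ) , &error ) < 0 ) {
            fprintf( stderr , "pa_simple_read() failed: %s\n" , pa_strerror( error ) );
            pa_simple_free( conn );
            return false;
        }
        fwrite( buffer , 1 , sizeof( buffer ) , out );
        captured += sizeof( buffer );
    }

    pa_simple_free( conn );
    return true;   /* note: this writes raw PCM, not a .WAV file with a header */
}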
Update: I tried using GStreamermm, but this doesn't work:
Glib::RefPtr<Pipeline> pipeline;
Glib::RefPtr<Element> sink, filter, source;
Glib::RefPtr<Gio::File> audioSrc = Gio::File::create_for_path(uri);

pipeline = Pipeline::create("audio-playback");
source = ElementFactory::create_element("alsasrc","source");
filter = ElementFactory::create_element("identity","filter");
sink = ElementFactory::create_element("alsasink","sink");
//sink->get_property("file",audioSrc);

if (!source || !filter || !sink) {
    showErrorDialog("Houston!","We got a problem.");
    return;
}

pipeline->add(source)->add(filter)->add(sink);
source->link(sink);   // note: "filter" is added to the pipeline but never linked

pipeline->set_state(Gst::STATE_PLAYING);
showInformation("Close this to stop recording");
pipeline->set_state(Gst::STATE_PAUSED);
Answer 0 (score: 4)
The "Hello World" application in the GStreamer documentation shows how to play an Ogg/Vorbis file. To use a WAV file instead, simply replace "oggdemux" with "wavparse" and "vorbisdec" with "identity" (the identity plugin does nothing; it's just a placeholder).
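Since your update already uses GStreamermm, that pipeline could look roughly like the sketch below. It is untested, the names (playWav, onPadAdded, "wav-playback") are just placeholders, and it assumes Gst::init() has already been called:

#include <gstreamermm.h>
#include <sigc++/sigc++.h>
#include <iostream>

// wavparse only creates its source pad once it has parsed the WAV header,
// so the parser -> sink link has to be made from the pad-added callback.
static void onPadAdded(const Glib::RefPtr<Gst::Pad>& newPad,
                       Glib::RefPtr<Gst::Element> sink)
{
    Glib::RefPtr<Gst::Pad> sinkPad = sink->get_static_pad("sink");
    if (sinkPad && !sinkPad->is_linked())
        newPad->link(sinkPad);
}

void playWav(const Glib::ustring& path)
{
    Glib::RefPtr<Gst::Pipeline> pipeline = Gst::Pipeline::create("wav-playback");
    Glib::RefPtr<Gst::Element> source = Gst::ElementFactory::create_element("filesrc", "source");
    Glib::RefPtr<Gst::Element> parser = Gst::ElementFactory::create_element("wavparse", "parser");
    Glib::RefPtr<Gst::Element> sink = Gst::ElementFactory::create_element("autoaudiosink", "sink");

    if (!source || !parser || !sink) {
        std::cerr << "Could not create one of the GStreamer elements" << std::endl;
        return;
    }

    source->set_property("location", path);

    pipeline->add(source)->add(parser)->add(sink);
    source->link(parser);
    parser->signal_pad_added().connect(sigc::bind(sigc::ptr_fun(&onPadAdded), sink));

    pipeline->set_state(Gst::STATE_PLAYING);
    // A real program would now run a Glib::MainLoop and watch the pipeline's
    // bus for EOS/error messages before tearing the pipeline down.
}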
Install development support for GStreamer (on Ubuntu) with...
sudo apt-get install libgstreamer0.10-dev
You'll need the following on the gcc command line to pull in the GStreamer libraries...
$(pkg-config --cflags --libs gstreamer-0.10)
Incidentally, you may find it useful to prototype GStreamer pipelines with "gst-launch" before writing any code.
## recording
gst-launch-0.10 autoaudiosrc ! wavenc ! filesink location=temp.wav
## playback
gst-launch-0.10 filesrc location=temp.wav ! wavparse ! autoaudiosink
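For reference, the recording pipeline above could be written with GStreamermm (to match your C++ front end) roughly as follows. Again this is only an untested sketch with placeholder names, and it assumes Gst::init() has been called:

#include <gstreamermm.h>

Glib::RefPtr<Gst::Pipeline> startRecording(const Glib::ustring& path)
{
    Glib::RefPtr<Gst::Pipeline> pipeline = Gst::Pipeline::create("audio-record");
    Glib::RefPtr<Gst::Element> source = Gst::ElementFactory::create_element("autoaudiosrc", "source");
    Glib::RefPtr<Gst::Element> encoder = Gst::ElementFactory::create_element("wavenc", "encoder");
    Glib::RefPtr<Gst::Element> filesink = Gst::ElementFactory::create_element("filesink", "sink");

    filesink->set_property("location", path);

    pipeline->add(source)->add(encoder)->add(filesink);
    // All three elements have static pads, so they can be linked up front.
    source->link(encoder)->link(filesink);

    pipeline->set_state(Gst::STATE_PLAYING);
    return pipeline;   // the caller stops recording with set_state(Gst::STATE_NULL)
}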
A GStreamer feature that may be useful for speech recognition is how easily audio-quality filters can be inserted into the pipeline; for example, you could reduce any noise present in the recording. A pointer to the list of GStreamer "good" plugins is here.
Also of interest: PocketSphinx (which seems relevant to your project) already has some GStreamer integration. See Using PocketSphinx with GStreamer and Python.
Answer 1 (score: 1)
GStreamer / Pulse / JACK are great. For quick and simple things, you can use SoX: http://sox.sourceforge.net/