FFmpeg + OpenAL - playing back streaming sound from video won't work

Date: 2014-01-27 16:28:37

Tags: c++ ffmpeg openal ogg vorbis

I am decoding an OGG video (theora & vorbis as codecs) and want to display it on screen (using Ogre 3D) while playing its sound. I can decode the image stream just fine, and the video plays back at the correct frame rate, etc.

However, I cannot get the sound to play using OpenAL.

Edit: I managed to get the played sound to at least resemble the actual audio from the video. The sample code below has been updated.

Edit 2: I am now able to get "almost" the correct sound. I had to set OpenAL to use AL_FORMAT_STEREO_FLOAT32 (after initializing the extension) instead of just STEREO16. Now the sound is "only" extremely high-pitched and stuttering, but at the correct speed.
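
For context, the demuxer and the audio decoder are opened with the usual FFmpeg boilerplate, roughly like this (a simplified sketch rather than my exact code; "video.ogg" is a placeholder and error handling is omitted):

AVFormatContext* formatContext = NULL;
AVCodec* audioCodec = NULL;

// Standard setup for the 2014-era FFmpeg API used below (avcodec_decode_audio4).
av_register_all();
avformat_open_input(&formatContext, "video.ogg", NULL, NULL);
avformat_find_stream_info(formatContext, NULL);

// Find the vorbis audio stream and open its decoder.
int audioStreamIndex = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO,
                                           -1, -1, &audioCodec, 0);
AVCodecContext* audioCodecContext = formatContext->streams[audioStreamIndex]->codec;
avcodec_open2(audioCodecContext, audioCodec, NULL);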

Here is how I decode the audio packets (in a background thread; the equivalent for the video file's image stream works fine):

//------------------------------------------------------------------------------
int decodeAudioPacket(  AVPacket& p_packet, AVCodecContext* p_audioCodecContext, AVFrame* p_frame,
                        FFmpegVideoPlayer* p_player, VideoInfo& p_videoInfo)
{
    // Decode audio frame
    int got_frame = 0;
    int decoded = avcodec_decode_audio4(p_audioCodecContext, p_frame, &got_frame, &p_packet);
    if (decoded < 0) 
    {
        p_videoInfo.error = "Error decoding audio frame.";
        return decoded;
    }

    // Frame is complete, store it in audio frame queue
    if (got_frame)
    {
        int bufferSize = av_samples_get_buffer_size(NULL, p_audioCodecContext->channels, p_frame->nb_samples, 
                                                    p_audioCodecContext->sample_fmt, 0);

        int64_t duration = p_frame->pkt_duration;
        int64_t dts = p_frame->pkt_dts;

        if (staticOgreLog)
        {
            staticOgreLog->logMessage("Audio frame bufferSize / duration / dts: " 
                    + boost::lexical_cast<std::string>(bufferSize) + " / "
                    + boost::lexical_cast<std::string>(duration) + " / "
                    + boost::lexical_cast<std::string>(dts), Ogre::LML_NORMAL);
        }

        // Create the audio frame
        AudioFrame* frame = new AudioFrame();
        frame->dataSize = bufferSize;
        frame->data = new uint8_t[bufferSize];
        if (p_frame->channels == 2)
        {
            memcpy(frame->data, p_frame->data[0], bufferSize >> 1);
            memcpy(frame->data + (bufferSize >> 1), p_frame->data[1], bufferSize >> 1);
        }
        else
        {
            memcpy(frame->data, p_frame->data, bufferSize);
        }
        double timeBase = ((double)p_audioCodecContext->time_base.num) / (double)p_audioCodecContext->time_base.den;
        frame->lifeTime = duration * timeBase;

        p_player->addAudioFrame(frame);
    }

    return decoded;
}
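
The AudioFrame struct used above is just a plain data holder; it looks roughly like this (a sketch showing only the members that appear in these snippets, the real class may contain more):

// Sketch of the AudioFrame struct as implied by the snippets.
struct AudioFrame
{
    uint8_t* data;      // interleaved sample bytes, owned by the frame
    int      dataSize;  // size of data in bytes
    double   lifeTime;  // duration of this frame in seconds
};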

So, as you can see, I decode the frame and store it in my own struct, AudioFrame. When the sound is played, I then use these audio frames like this:

    int numBuffers = 4;
    ALuint buffers[4];
    alGenBuffers(numBuffers, buffers);
    ALenum success = alGetError();
    if(success != AL_NO_ERROR)
    {
        CONSOLE_LOG("Error on alGenBuffers : " + Ogre::StringConverter::toString(success) + alGetString(success));
        return;
    }

    // Fill a number of data buffers with audio from the stream
    std::vector<AudioFrame*> audioBuffers;
    std::vector<unsigned int> audioBufferSizes;
    unsigned int numReturned = FFMPEG_PLAYER->getDecodedAudioFrames(numBuffers, audioBuffers, audioBufferSizes);

    // Assign the data buffers to the OpenAL buffers
    for (unsigned int i = 0; i < numReturned; ++i)
    {
        alBufferData(buffers[i], _streamingFormat, audioBuffers[i]->data, audioBufferSizes[i], _streamingFrequency);

        success = alGetError();
        if(success != AL_NO_ERROR)
        {
            CONSOLE_LOG("Error on alBufferData : " + Ogre::StringConverter::toString(success) + alGetString(success)
                            + " size: " + Ogre::StringConverter::toString(audioBufferSizes[i]));
            return;
        }
    }

    // Queue the buffers into OpenAL
    alSourceQueueBuffers(_source, numReturned, buffers);
    success = alGetError();
    if(success != AL_NO_ERROR)
    {
        CONSOLE_LOG("Error queuing streaming buffers: " + Ogre::StringConverter::toString(success) + alGetString(success));
        return;
    }
}

alSourcePlay(_source);

The format and frequency I give to OpenAL are AL_FORMAT_STEREO_FLOAT32 (it is a stereo sound stream, and I did initialize the FLOAT32 extension) and 48000 (the sample rate of the audio stream's AVCodecContext).
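
For reference, _streamingFormat and _streamingFrequency are set up roughly like this (a simplified sketch, assuming a stereo stream and the AL_EXT_float32 extension):

// Simplified sketch: derive the OpenAL format and frequency from the stream.
_streamingFrequency = audioCodecContext->sample_rate;   // 48000 for this stream

if (alIsExtensionPresent("AL_EXT_FLOAT32"))
    _streamingFormat = alGetEnumValue("AL_FORMAT_STEREO_FLOAT32");
else
    _streamingFormat = AL_FORMAT_STEREO16;               // 16-bit fallback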

During playback, I do the following to refill OpenAL's buffers:

ALint numBuffersProcessed;

// Check if OpenAL is done with any of the queued buffers
alGetSourcei(_source, AL_BUFFERS_PROCESSED, &numBuffersProcessed);
if(numBuffersProcessed <= 0)
    return;

// Fill a number of data buffers with audio from the stream
std::vector<AudioFrame*> audioBuffers;
std::vector<unsigned int> audioBufferSizes;
unsigned int numFilled = FFMPEG_PLAYER->getDecodedAudioFrames(numBuffersProcessed, audioBuffers, audioBufferSizes);

// Assign the data buffers to the OpenAL buffers
ALuint buffer;
for (unsigned int i = 0; i < numFilled; ++i)
{
    // Pop the oldest queued buffer from the source, 
    // fill it with the new data, then re-queue it
    alSourceUnqueueBuffers(_source, 1, &buffer);

    ALenum success = alGetError();
    if(success != AL_NO_ERROR)
    {
        CONSOLE_LOG("Error Unqueuing streaming buffers: " + Ogre::StringConverter::toString(success));
        return;
    }

    alBufferData(buffer, _streamingFormat, audioBuffers[i]->data, audioBufferSizes[i], _streamingFrequency);

    success = alGetError();
    if(success != AL_NO_ERROR)
    {
        CONSOLE_LOG("Error on re- alBufferData: " + Ogre::StringConverter::toString(success));
        return;
    }

    alSourceQueueBuffers(_source, 1, &buffer);

    success = alGetError();
    if(success != AL_NO_ERROR)
    {
        CONSOLE_LOG("Error re-queuing streaming buffers: " + Ogre::StringConverter::toString(success) + " "
                    + alGetString(success));
        return;
    }
}

// Make sure the source is still playing, 
// and restart it if needed.
ALint playStatus;
alGetSourcei(_source, AL_SOURCE_STATE, &playStatus);
if(playStatus != AL_PLAYING)
    alSourcePlay(_source);

As you can see, I do a lot of error checking. But I do not get any errors, neither from OpenAL nor from FFmpeg.

Edit: What I hear somewhat resembles the actual audio from the video, but it is extremely high-pitched and stutters a lot. Also, it seems to be playing on top of TV noise, which is very strange. In addition, it plays back much slower than the correct audio would.

Edit 2: After switching to AL_FORMAT_STEREO_FLOAT32, the sound plays at the correct speed, but it is still very high-pitched and stuttering (though less than before).

The video itself is not broken; it plays fine in any player.

OpenAL is also able to play *.wav files in the same application without problems, so it is working, too.

Any ideas what could be wrong here, or how to do this correctly?

My only guess is that, somehow, FFmpeg's decoding function does not produce data that OpenAL can read. But this is as far as the FFmpeg decoding example goes, so I don't know what is missing. As I understand it, the avcodec_decode_audio4 function decodes the frame to raw data, and OpenAL should be able to work with raw data (or rather, does not work with anything else).

1 Answer:

Answer 0 (score: 2):

So, I finally figured out how to do it. Gosh, what a mess. A hint from a user on the libav-users mailing list put me on the right track.

Here were my mistakes:

  1. Using the wrong format with the alBufferData function. I used AL_FORMAT_STEREO16 (as that is what pretty much every OpenAL streaming example uses). I should have used AL_FORMAT_STEREO_FLOAT32, as the video I stream is Ogg and vorbis is stored in floats. And using swr_convert to convert from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16 simply crashes. No idea why.

  2. Not using swr_convert to convert the decoded audio frames to the target format. After I had tried to use swr_convert to convert from FLTP to S16, and it crashed without giving a reason, I assumed it was broken. But after figuring out my first mistake, I tried again, converting from FLTP to FLT (non-planar), and then it worked! So OpenAL uses an interleaved format, not a planar one. Good to know.

  3. So here is the decodeAudioPacket function that works for me for an Ogg video with a vorbis audio stream:

    int decodeAudioPacket(  AVPacket& p_packet, AVCodecContext* p_audioCodecContext, AVFrame* p_frame,
                            SwrContext* p_swrContext, uint8_t** p_destBuffer, int p_destLinesize,
                            FFmpegVideoPlayer* p_player, VideoInfo& p_videoInfo)
    {
        // Decode audio frame
        int got_frame = 0;
        int decoded = avcodec_decode_audio4(p_audioCodecContext, p_frame, &got_frame, &p_packet);
        if (decoded < 0) 
        {
            p_videoInfo.error = "Error decoding audio frame.";
            return decoded;
        }
    
        if(decoded <= p_packet.size)
        {
            /* Move the unread data to the front and clear the end bits */
            int remaining = p_packet.size - decoded;
            memmove(p_packet.data, &p_packet.data[decoded], remaining);
            av_shrink_packet(&p_packet, remaining);
        }
    
        // Frame is complete, store it in audio frame queue
        if (got_frame)
        {
            int outputSamples = swr_convert(p_swrContext, 
                                            p_destBuffer, p_destLinesize, 
                                            (const uint8_t**)p_frame->extended_data, p_frame->nb_samples);
    
            int bufferSize = av_get_bytes_per_sample(AV_SAMPLE_FMT_FLT) * p_videoInfo.audioNumChannels
                                * outputSamples;
    
            int64_t duration = p_frame->pkt_duration;
            int64_t dts = p_frame->pkt_dts;
    
            if (staticOgreLog)
            {
                staticOgreLog->logMessage("Audio frame bufferSize / duration / dts: " 
                        + boost::lexical_cast<std::string>(bufferSize) + " / "
                        + boost::lexical_cast<std::string>(duration) + " / "
                        + boost::lexical_cast<std::string>(dts), Ogre::LML_NORMAL);
            }
    
            // Create the audio frame
            AudioFrame* frame = new AudioFrame();
            frame->dataSize = bufferSize;
            frame->data = new uint8_t[bufferSize];
            memcpy(frame->data, p_destBuffer[0], bufferSize);
            double timeBase = ((double)p_audioCodecContext->time_base.num) / (double)p_audioCodecContext->time_base.den;
            frame->lifeTime = duration * timeBase;
    
            p_player->addAudioFrame(frame);
        }
    
        return decoded;
    }
    

    And here is how I initialize the swr context and the destination buffer:

    // Initialize SWR context
    SwrContext* swrContext = swr_alloc_set_opts(NULL, 
                audioCodecContext->channel_layout, AV_SAMPLE_FMT_FLT, audioCodecContext->sample_rate,
                audioCodecContext->channel_layout, audioCodecContext->sample_fmt, audioCodecContext->sample_rate, 
                0, NULL);
    int result = swr_init(swrContext);
    
    // Create destination sample buffer
    uint8_t** destBuffer = NULL;
    int destBufferLinesize;
    av_samples_alloc_array_and_samples( &destBuffer,
                                        &destBufferLinesize,
                                        videoInfo.audioNumChannels,
                                        2048,
                                        AV_SAMPLE_FMT_FLT,
                                        0);
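
    For completeness, here is a rough sketch of the packet loop that drives the function above, and of the teardown. This is not my exact code: formatContext, audioStreamIndex and player are assumed from the usual demuxer/player setup.

    // Rough sketch of the surrounding read loop; one decode call per packet
    // is enough here, since a vorbis packet yields a single frame.
    AVFrame* frame = av_frame_alloc();
    AVPacket packet;

    while (av_read_frame(formatContext, &packet) >= 0)
    {
        if (packet.stream_index == audioStreamIndex)
        {
            decodeAudioPacket(packet, audioCodecContext, frame, swrContext,
                              destBuffer, destBufferLinesize, player, videoInfo);
        }
        av_free_packet(&packet);
    }

    // Teardown: av_samples_alloc_array_and_samples allocates both the sample
    // buffer and the pointer array, so both need to be freed.
    av_freep(&destBuffer[0]);
    av_freep(&destBuffer);
    swr_free(&swrContext);
    av_frame_free(&frame);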