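I am writing a program that streams live audio and video from a webcam to an RTMP server. I work on Mac OS X 10.8, so I use the AVFoundation framework to obtain audio and video frames from the input devices. These frames arrive in the delegate method:
-(void) captureOutput:(AVCaptureOutput*)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection*)connection
where sampleBuffer contains audio or video data.
When I receive audio data in sampleBuffer, I try to convert it into an AVFrame and encode the AVFrame with libavcodec: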
aframe = avcodec_alloc_frame(); //AVFrame *aframe;
int got_packet, ret;
CMItemCount numSamples = CMSampleBufferGetNumSamples(sampleBuffer); //CMSampleBufferRef
NSUInteger channelIndex = 0;
CMBlockBufferRef audioBlockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
size_t audioBlockBufferOffset = (channelIndex * numSamples * sizeof(SInt16));
size_t lengthAtOffset = 0;
size_t totalLength = 0;
SInt16 *samples = NULL;
CMBlockBufferGetDataPointer(audioBlockBuffer, audioBlockBufferOffset, &lengthAtOffset, &totalLength, (char **)(&samples));
const AudioStreamBasicDescription *audioDescription = CMAudioFormatDescriptionGetStreamBasicDescription(CMSampleBufferGetFormatDescription(sampleBuffer));
aframe->nb_samples =(int) numSamples;
aframe->channels=audioDescription->mChannelsPerFrame;
aframe->sample_rate=(int)audioDescription->mSampleRate;
//my webCamera configured to produce 16bit 16kHz LPCM mono, so sample format hardcoded here, and seems to be correct
avcodec_fill_audio_frame(aframe, aframe->channels, AV_SAMPLE_FMT_S16,
(uint8_t *)samples,
aframe->nb_samples *
av_get_bytes_per_sample(AV_SAMPLE_FMT_S16) *
aframe->channels, 0);
//encoding audio
ret = avcodec_encode_audio2(c, &pkt, aframe, &got_packet);
if (ret < 0) {
fprintf(stderr, "Error encoding audio frame: %s\n", av_err2str(ret));
exit(1);
}
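The problem is that when I play back frames built this way, I can hear the expected sound, but it is slowed down and discontinuous (as if every data frame were followed by a silent frame of the same length). Something seems to go wrong in the conversion from CMSampleBuffer to AVFrame, because the microphone preview that AVFoundation creates from the same sample buffers plays normally.
I would be grateful for your help.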
UPD: Creating and initializing the AVCodecContext structure:
audio_codec= avcodec_find_encoder(AV_CODEC_ID_AAC);
if (!(audio_codec)) {
fprintf(stderr, "Could not find encoder for '%s'\n",
avcodec_get_name(AV_CODEC_ID_AAC));
exit(1);
}
audio_st = avformat_new_stream(oc, audio_codec); //AVFormatContext *oc;
if (!audio_st) {
fprintf(stderr, "Could not allocate stream\n");
exit(1);
}
audio_st->id=1;
audio_st->codec->sample_fmt= AV_SAMPLE_FMT_S16;
audio_st->codec->bit_rate = 64000;
audio_st->codec->sample_rate= 16000;
audio_st->codec->channels=1;
audio_st->codec->codec_type= AVMEDIA_TYPE_AUDIO;
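One thing that may be worth checking, though nothing in the post confirms it: whether the number of samples in each capture buffer matches the frame size the encoder expects. After avcodec_open2 the encoder reports that size in the frame_size field of AVCodecContext, and for AAC it is typically 1024 samples, while a capture buffer may hold a different count. If the two differ, regrouping the samples into fixed-size frames before encoding is one option. The library-free sketch below illustrates the idea only; SampleFifo, fifo_init, fifo_push, fifo_pop_frame and FIFO_CAPACITY are hypothetical names, and mono 16-bit samples are assumed, as in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity in samples; ample for a few 1024-sample frames. */
#define FIFO_CAPACITY 8192

/* Hypothetical helper type: a linear FIFO of mono 16-bit samples. */
typedef struct {
    int16_t data[FIFO_CAPACITY];
    size_t  count;               /* samples currently buffered */
} SampleFifo;

static void fifo_init(SampleFifo *f)
{
    assert(f != NULL);
    f->count = 0;
}

/* Append n captured samples. Returns 0 on success, -1 if they do not fit. */
static int fifo_push(SampleFifo *f, const int16_t *src, size_t n)
{
    assert(f != NULL && src != NULL);
    if (n > FIFO_CAPACITY - f->count)
        return -1;
    memcpy(f->data + f->count, src, n * sizeof(int16_t));
    f->count += n;
    return 0;
}

/* Move exactly frame_size samples into dst.
 * Returns 1 if a full frame was produced, 0 if too few samples are buffered. */
static int fifo_pop_frame(SampleFifo *f, int16_t *dst, size_t frame_size)
{
    assert(f != NULL && dst != NULL);
    if (f->count < frame_size)
        return 0;
    memcpy(dst, f->data, frame_size * sizeof(int16_t));
    f->count -= frame_size;
    /* Shift the leftover samples to the front of the buffer. */
    memmove(f->data, f->data + frame_size, f->count * sizeof(int16_t));
    return 1;
}
```

In the delegate, this would mean pushing numSamples samples from samples on every callback, then calling fifo_pop_frame in a loop with the encoder's frame_size and building and encoding one AVFrame per full frame returned. libavutil also provides AVAudioFifo (av_audio_fifo_alloc, av_audio_fifo_write, av_audio_fifo_read), which could play the same role in place of this hand-written buffer.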