How can I develop an image recognition program?

Time: 2017-07-18 10:32:38

Tags: c++ ffmpeg mfc

My program should detect the commercials between TV programmes, but I don't know how to recognize a commercial. I have thought about audio recognition, but that seems hard. I am using the FFmpeg library. Here is the VideoState struct for reference.

typedef struct VideoState {
SDL_Thread *read_tid;
SDL_Thread *video_tid;
SDL_Thread *refresh_tid;
AVInputFormat *iformat;
int no_background;
int abort_request;
int force_refresh;
int paused;
int last_paused;
int que_attachments_req;
int seek_req;
int seek_flags;
int64_t seek_pos;
int64_t seek_rel;
int read_pause_return;
AVFormatContext *ic;

int audio_stream;

int av_sync_type;
double external_clock; /* external clock base */
int64_t external_clock_time;

double audio_clock;
double audio_diff_cum; /* used for AV difference average computation */
double audio_diff_avg_coef;
double audio_diff_threshold;
int audio_diff_avg_count;
AVStream *audio_st;
PacketQueue audioq;
int audio_hw_buf_size;
DECLARE_ALIGNED(16,uint8_t,audio_buf2)[AVCODEC_MAX_AUDIO_FRAME_SIZE * 4];
uint8_t silence_buf[SDL_AUDIO_BUFFER_SIZE];
uint8_t *audio_buf;
uint8_t *audio_buf1;
unsigned int audio_buf_size; /* in bytes */
int audio_buf_index; /* in bytes */
int audio_write_buf_size;
AVPacket audio_pkt_temp;
AVPacket audio_pkt;
struct AudioParams audio_src;
struct AudioParams audio_tgt;
struct SwrContext *swr_ctx;
double audio_current_pts;
double audio_current_pts_drift;
int frame_drops_early;
int frame_drops_late;
AVFrame *frame;

enum ShowMode {
    SHOW_MODE_NONE = -1, SHOW_MODE_VIDEO = 0, SHOW_MODE_WAVES,
    SHOW_MODE_RDFT, SHOW_MODE_NB
} show_mode;
int16_t sample_array[SAMPLE_ARRAY_SIZE];
int sample_array_index;
int last_i_start;
RDFTContext *rdft;
int rdft_bits;
FFTSample *rdft_data;
int xpos;

SDL_Thread *subtitle_tid;
int subtitle_stream;
int subtitle_stream_changed;
AVStream *subtitle_st;
PacketQueue subtitleq;
SubPicture subpq[SUBPICTURE_QUEUE_SIZE];
int subpq_size, subpq_rindex, subpq_windex;
SDL_mutex *subpq_mutex;
SDL_cond *subpq_cond;

double frame_timer;
double frame_last_pts;
double frame_last_duration;
double frame_last_dropped_pts;
double frame_last_returned_time;
double frame_last_filter_delay;
int64_t frame_last_dropped_pos;
double video_clock;                          ///< pts of last decoded frame / predicted pts of next decoded frame
int video_stream;
AVStream *video_st;
PacketQueue videoq;
double video_current_pts;                    ///< current displayed pts (different from video_clock if frame fifos are used)
double video_current_pts_drift;              ///< video_current_pts - time (av_gettime) at which we updated video_current_pts - used to have running video pts
int64_t video_current_pos;                   ///< current displayed file pos
VideoPicture pictq[VIDEO_PICTURE_QUEUE_SIZE];
int pictq_size, pictq_rindex, pictq_windex;
SDL_mutex *pictq_mutex;
SDL_cond *pictq_cond;
#if !CONFIG_AVFILTER
struct SwsContext *img_convert_ctx;
#endif

char filename[1024];
int width, height, xleft, ytop;
int step;

#if CONFIG_AVFILTER
AVFilterContext *in_video_filter;           ///< the first filter in the video chain
AVFilterContext *out_video_filter;          ///< the last filter in the video chain
int use_dr1;
FrameBuffer *buffer_pool;
#endif

int refresh;
int last_video_stream, last_audio_stream, last_subtitle_stream;

SDL_cond *continue_read_thread;

enum V_Show_Mode v_show_mode;
} VideoState;

What can my program use for this... I really need your help. Thank you!!!

1 Answer:

Answer 0 (score: 1)

Use a proper image processing / computer vision library:

The FFmpeg library is meant for changing the properties of video and audio (fps, bitrate, sample rate, codec, etc.). For the task you describe here, you should use a dedicated image processing or computer vision library.

  • There are a few options you can choose from, such as MATLAB's image processing toolbox, the Java-based ImageJ and OpenIMAJ, and OpenCV (in C/C++ and Python).
  • The best fit for you is OpenCV, because it is a complete computer vision library with ready-to-use built-in functions, so you can use them without knowing the background details of every algorithm.
  • OpenCV also includes video I/O and video analysis modules that will help here. It is written in C++, so you do not have to worry about optimizing these algorithms yourself; they are already optimized, which matters because you will be processing a continuous video stream and do not want your program to slow down during processing. A minimal sketch of this idea follows the list.
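
For illustration only (not part of the original answer), here is a minimal C++ sketch using OpenCV's VideoCapture to flag nearly-black frames, which in many broadcasts mark the cuts around commercial blocks. The input file name, brightness threshold, and the black-frame heuristic itself are assumptions you would have to tune and combine with other cues (e.g. silence, logo detection) for real ad detection.

// Sketch: scan a recording and report nearly-black frames as candidate
// commercial-break boundaries. Threshold and input path are assumptions.
#include <opencv2/opencv.hpp>
#include <iostream>

int main(int argc, char** argv)
{
    const char* path = (argc > 1) ? argv[1] : "recording.mp4"; // assumed input file
    cv::VideoCapture cap(path);
    if (!cap.isOpened()) {
        std::cerr << "Cannot open " << path << std::endl;
        return 1;
    }

    const double fps = cap.get(cv::CAP_PROP_FPS);
    const double blackThreshold = 20.0;  // assumed: mean gray level below this counts as "black"
    cv::Mat frame, gray;
    long frameIndex = 0;

    while (cap.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        const double meanLevel = cv::mean(gray)[0];  // average brightness of this frame

        if (meanLevel < blackThreshold) {
            // Candidate boundary between programme content and a commercial block.
            std::cout << "black frame at " << frameIndex / fps << " s" << std::endl;
        }
        ++frameIndex;
    }
    return 0;
}

In practice you would group consecutive black frames into a single boundary and then look at the segment lengths between boundaries (commercials are typically short, fixed-length blocks).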

I hope this helps.