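ffmpeg - merging mp3 and mp4 (duration difference)

Date: 2017-12-06 20:18:12

Tags: video ffmpeg mp3 mp4 concat

I am trying to merge an mp4 file and an mp3 file with ffmpeg. The mp4 is 9.800 s long and the mp3 is 58.540 s long, so I use the -shortest option. Command: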

ffmpeg -i video.mp4 -i audio.mp3 -c:v libx264 -c:a aac -strict experimental -shortest output.mp4

Afterwards I get an output.mp4 with a duration of 9.846 s. What am I doing wrong? Why is the output longer than the source video (9.846 s versus 9.800 s)?

Source mp4 MediaInfo:

General
Complete name                  : F:\video test\video.mp4
Format                         : MPEG-4
Format profile                 : Base Media
Codec ID                       : iso5 (iso5/dash)
File size                      : 3.19 MiB
Duration                       : 9 s 800 ms
Overall bit rate               : 2 732 kb/s
Encoded date                   : UTC 2017-11-24 20:53:53
Tagged date                    : UTC 2017-11-24 20:53:53

Video
ID                             : 1
Format                         : AVC
Format/Info                    : Advanced Video Codec
Format profile                 : High@L3.1
Format settings                : CABAC / 4 Ref Frames
Format settings, CABAC         : Yes
Format settings, ReFrames      : 4 frames
Codec ID                       : avc1
Codec ID/Info                  : Advanced Video Coding
Duration                       : 9 s 800 ms
Bit rate                       : 2 729 kb/s
Maximum bit rate               : 3 766 kb/s
Width                          : 1 280 pixels
Height                         : 720 pixels
Display aspect ratio           : 16:9
Frame rate mode                : Constant
Frame rate                     : 25.000 FPS
Color space                    : YUV
Chroma subsampling             : 4:2:0
Bit depth                      : 8 bits
Scan type                      : Progressive
Bits/(Pixel*Frame)             : 0.118
Stream size                    : 3.19 MiB (100%)
Writing library                : x264 core 146
Encoding settings              : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=12 / lookahead_threads=2 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=25 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=23.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00
Tagged date                    : UTC 2017-11-24 20:53:53

Source mp3 MediaInfo:

General
Complete name                  : F:\video test\audio.mp3
Format                         : MPEG Audio
File size                      : 1.19 MiB
Duration                       : 58 s 540 ms
Overall bit rate mode          : Variable
Overall bit rate               : 170 kb/s
Writing library                : LAME3.99r

Audio
Format                         : MPEG Audio
Format version                 : Version 1
Format profile                 : Layer 3
Format settings                : Joint stereo / MS Stereo
Duration                       : 58 s 540 ms
Bit rate mode                  : Variable
Bit rate                       : 170 kb/s
Minimum bit rate               : 32.0 kb/s
Channel(s)                     : 2 channels
Sampling rate                  : 44.1 kHz
Frame rate                     : 38.281 FPS (1152 SPF)
Compression mode               : Lossy
Stream size                    : 1.19 MiB (100%)
Writing library                : LAME3.99r
Encoding settings              : -m j -V 2 -q 0 -lowpass 18.5 --vbr-new -b 32

Console output:

ffmpeg version 3.4 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 7.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-cuda --enable-cuvid --enable-d3d11va --enable-nvenc --enable-dxva2 --enable-avisynth --enable-libmfx
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mp4':
  Metadata:
    major_brand     : iso5
    minor_version   : 1
    compatible_brands: iso5dash
    creation_time   : 2017-11-24T20:53:53.000000Z
  Duration: 00:00:09.80, start: 0.000000, bitrate: 2732 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 2259 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    : VideoHandler
Input #1, mp3, from 'audio.mp3':
  Duration: 00:00:58.54, start: 0.025057, bitrate: 170 kb/s
    Stream #1:0: Audio: mp3, 44100 Hz, stereo, s16p, 170 kb/s
    Metadata:
      encoder         : LAME3.99r
    Side data:
      replaygain: track gain - -2.200000, track peak - unknown, album gain - unknown, album peak - unknown, 
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (mp3 (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 00000000005ab440] using SAR=1/1
[libx264 @ 00000000005ab440] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 00000000005ab440] profile High, level 3.1
[libx264 @ 00000000005ab440] 264 - core 152 r2851 ba24899 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    major_brand     : iso5
    minor_version   : 1
    compatible_brands: iso5dash
    encoder         : Lavf57.83.100
    Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p(progressive), 1280x720 [SAR 1:1 DAR 16:9], q=-1--1, 25 fps, 12800 tbn, 25 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : Lavc57.107.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc57.107.100 aac
    Side data:
      replaygain: track gain - -2.200000, track peak - unknown, album gain - unknown, album peak - unknown, 
frame=   54 fps=0.0 q=28.0 size=       0kB time=00:00:00.04 bitrate=   8.3kbits/s speed=0.0927x    
frame=   80 fps= 80 q=28.0 size=       0kB time=00:00:01.09 bitrate=   0.4kbits/s speed=1.09x    
frame=   98 fps= 65 q=28.0 size=     256kB time=00:00:01.83 bitrate=1143.5kbits/s speed=1.21x    
frame=  119 fps= 59 q=28.0 size=     512kB time=00:00:02.67 bitrate=1570.9kbits/s speed=1.32x    
frame=  144 fps= 56 q=28.0 size=     768kB time=00:00:03.66 bitrate=1715.0kbits/s speed=1.42x    
frame=  167 fps= 52 q=28.0 size=    1024kB time=00:00:04.57 bitrate=1833.9kbits/s speed=1.44x    
frame=  190 fps= 51 q=28.0 size=    1280kB time=00:00:05.50 bitrate=1905.5kbits/s speed=1.47x    
frame=  218 fps= 51 q=28.0 size=    1792kB time=00:00:06.64 bitrate=2210.6kbits/s speed=1.56x    
frame=  242 fps= 50 q=28.0 size=    2048kB time=00:00:07.56 bitrate=2216.4kbits/s speed=1.58x    
frame=  245 fps= 41 q=-1.0 Lsize=    3045kB time=00:00:09.82 bitrate=2539.6kbits/s speed=1.65x    
video:2880kB audio:156kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.298058%
[libx264 @ 00000000005ab440] frame I:14    Avg QP:20.01  size: 39750
[libx264 @ 00000000005ab440] frame P:106   Avg QP:23.85  size: 14578
[libx264 @ 00000000005ab440] frame B:125   Avg QP:24.63  size:  6770
[libx264 @ 00000000005ab440] consecutive B-frames: 22.9% 22.0% 15.9% 39.2%
[libx264 @ 00000000005ab440] mb I  I16..4: 16.7% 80.3%  3.0%
[libx264 @ 00000000005ab440] mb P  I16..4: 10.2% 36.2%  1.1%  P16..4: 25.0%  7.9%  2.5%  0.0%  0.0%    skip:17.1%
[libx264 @ 00000000005ab440] mb B  I16..4:  2.3%  5.8%  0.2%  B16..8: 31.4%  6.5%  0.9%  direct: 3.7%  skip:49.2%  L0:51.8% L1:44.5% BI: 3.7%
[libx264 @ 00000000005ab440] 8x8 transform intra:76.1% inter:86.3%
[libx264 @ 00000000005ab440] coded y,uvDC,uvAC intra: 38.3% 52.1% 9.0% inter: 12.3% 20.1% 0.2%
[libx264 @ 00000000005ab440] i16 v,h,dc,p: 30% 28%  9% 33%
[libx264 @ 00000000005ab440] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 36% 23% 19%  3%  3%  4%  4%  4%  4%
[libx264 @ 00000000005ab440] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 33% 21% 14%  5%  7%  7%  6%  5%  3%
[libx264 @ 00000000005ab440] i8c dc,h,v,p: 45% 24% 25%  6%
[libx264 @ 00000000005ab440] Weighted P-Frames: Y:13.2% UV:6.6%
[libx264 @ 00000000005ab440] ref P L0: 71.7% 12.5% 12.9%  2.7%  0.2%
[libx264 @ 00000000005ab440] ref B L0: 92.8%  6.3%  0.9%
[libx264 @ 00000000005ab440] ref B L1: 98.3%  1.7%
[libx264 @ 00000000005ab440] kb/s:2406.56
[aac @ 00000000005adde0] Qavg: 511.420

The ffprobe -show_packets output is too large to post here, so I uploaded it to pastebin: https://pastebin.com/TYSMdceS

1 answer:

Answer 0 (score: 2)

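The quick answer to your question is that FFmpeg's AAC encoder writes an extra AAC priming packet at the start, beginning at -0.0232 s, and this adds to your duration. I will try to explain in as much detail as I can, in case that helps. You can inspect the packets yourself with ffprobe -show_packets output.mp4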
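As a rough way to check this per stream, the sketch below summarizes the packet dump instead of reading it by eye. It assumes ffprobe is on the PATH and is run with JSON output (`-of json`), and it relies on the `stream_index`, `pts_time` and `duration_time` packet fields. The function names `probe_packets` and `stream_spans` are made up for this illustration.

```python
import json
import subprocess


def probe_packets(path):
    """Run ffprobe on `path` and return its packet list (needs ffprobe on PATH)."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_packets", "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)["packets"]


def stream_spans(packets):
    """Return {stream_index: (earliest pts, latest pts + duration)} in seconds."""
    spans = {}
    for pkt in packets:
        if "pts_time" not in pkt:
            continue  # skip packets that carry no presentation timestamp
        start = float(pkt["pts_time"])
        end = start + float(pkt.get("duration_time", 0.0))
        idx = pkt["stream_index"]
        lo, hi = spans.get(idx, (start, end))
        spans[idx] = (min(lo, start), max(hi, end))
    return spans


if __name__ == "__main__":
    for idx, (first, last) in sorted(stream_spans(probe_packets("output.mp4")).items()):
        print(f"stream {idx}: starts {first:.6f} s, ends {last:.6f} s")
```

A stream whose first pts is negative, like the audio here, will show up with a start before 0.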

I looked into the packet dump you shared. Your video packets look like

dts: -0.08 | pts: 0.0
dts: -0.04 | pts: 0.12
dts:  0.0  | pts: 0.04
dts:  0.04 | pts: 0.08
dts:  0.08 | pts: 0.24
...
dts:  9.64 | pts: 9.76
dts:  9.68 | pts: 9.72

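The pts values jump back and forth, probably because you have B-frames in I B B P order. Your video stream is 25 fps, i.e. 1 frame duration = 0.04 s. That makes your video 9.76 + 0.04 (frame duration) = 9.8 s long.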
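The same arithmetic as a small sketch, using exact fractions so that the 0.04 s frame duration does not pick up floating-point noise. The frame count of 245 is taken from the last progress line of the console output; the helper name `video_end` is only illustrative.

```python
from fractions import Fraction


def video_end(last_pts, fps):
    """End time of a constant-frame-rate stream: highest pts plus one frame."""
    return Fraction(last_pts) + 1 / Fraction(fps)


end_from_pts = video_end("9.76", 25)   # highest pts in the dump + 1/25 s
end_from_count = Fraction(245, 25)     # 245 encoded frames at 25 fps

print(float(end_from_pts), float(end_from_count))
```

Both routes give the same 9.8 s, so the video stream itself is not what stretches the file.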

Your original audio is longer than the video, so it is cut off at the first audio packet that ends at 9.80 s or later. Your audio packets look like

pts: -0.023220 (AAC priming data)
pts:  0.0
pts:  0.023220
...
pts:  9.775601 | duration: 0.023220
pts:  9.798821 | duration: 0.023175

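Your last audio packet has to end at or after 9.80 s, which is why the packet at 9.79 (pts 9.798821) is still accepted. So the duration of the audio written to the output is 0.02322 (priming pkt) + 9.798821 + 0.023175 (dur) = 9.845216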
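The sum can be reproduced with a short sketch. It uses `Decimal` so the six-digit timestamps from the packet dump add up exactly; the constant names are just labels for the numbers quoted above.

```python
from decimal import Decimal

PRIMING = Decimal("0.02322")     # priming packet that sits before pts 0
LAST_PTS = Decimal("9.798821")   # pts of the last audio packet that was kept
LAST_DUR = Decimal("0.023175")   # duration of that last packet
VIDEO_END = Decimal("9.8")       # end of the video stream

audio_span = PRIMING + LAST_PTS + LAST_DUR
excess = audio_span - VIDEO_END

print("audio span:", audio_span)
print("longer than video by:", excess)
```

Under these numbers the audio runs about 0.045 s past the end of the video, which accounts for almost all of the difference between 9.800 s and the reported 9.846 s.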

I am not sure where the remaining 0.001 s comes from; someone else may be able to comment. I do see skipped samples flagged at the start:

[SIDE_DATA]
side_data_type=Skip Samples
skip_samples=1024
discard_padding=0
skip_reason=0
discard_reason=0
[/SIDE_DATA]
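The skip_samples value lines up with the priming packet's timestamp. A minimal sketch of the conversion, assuming the 44100 Hz sample rate reported for the output audio stream (the helper name `samples_to_seconds` is illustrative):

```python
from fractions import Fraction


def samples_to_seconds(samples, sample_rate):
    """Convert a sample count to seconds as an exact fraction."""
    return Fraction(samples, sample_rate)


priming = samples_to_seconds(1024, 44100)
print(round(float(priming), 6))
```

Rounded to six decimals this is 0.02322 s, the same magnitude as the -0.023220 pts of the first audio packet.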

I hope this helps.