Creating an image sequence from lyrics for use in ffmpeg

Time: 2018-09-16 19:02:39

Tags: python ffmpeg

I am trying to write an MP3 + lyrics -> MP4 program in Python.

I have a lyrics file like this:

[00:00.60]Revelation, chapter 4
[00:02.34]After these things I looked, 
[00:04.10]and behold a door was opened in heaven, 
[00:06.41]and the first voice which I heard, as it were, 
[00:08.78]of a trumpet speaking with me, said: 
[00:11.09]Come up hither, 
[00:12.16]and I will shew thee the things which must be done hereafter.
[00:15.78]And immediately I was in the spirit: 
[00:18.03]and behold there was a throne set in heaven, 
[00:20.72]and upon the throne one sitting.
[00:22.85]And he that sat, 
[00:23.91]was to the sight like the jasper and the sardine stone; 
[00:26.97]and there was a rainbow round about the throne, 
[00:29.16]in sight like unto an emerald.
[00:31.35]And round about the throne were four and twenty seats; 
[00:34.85]and upon the seats, four and twenty ancients sitting, 
[00:38.03]clothed in white garments, and on their heads were crowns of gold.
[00:41.97]And from the throne proceeded lightnings, and voices, and thunders; 
[00:46.03]and there were seven lamps burning before the throne, 
[00:48.60]which are the seven spirits of God. 
[00:51.23]And in the sight of the throne was, as it were, 
[00:53.79]a sea of glass like to crystal; 
[00:56.16]and in the midst of the throne, and round about the throne, 
[00:59.29]were four living creatures, full of eyes before and behind.
[01:03.79]And the first living creature was like a lion: 

I am trying to create a sequence of images from the lyrics to feed to ffmpeg.

os.system(ffmpeg_path + " -r 2 -i " + images_path + "image%1d.png -i " + audio_file + " -vcodec mpeg4 -y " + video_name)

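I tried to work out how many images to make for each line by subtracting the current line's timestamp from the next line's. It works, but the results are inconsistent.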

import os
import datetime
import time
import math
from PIL import Image, ImageDraw


ffmpeg_path = os.getcwd() + "\\ffmpeg\\bin\\ffmpeg.exe"
images_path = os.getcwd() + "\\test_output\\"
audio_file = os.getcwd() + "\\audio.mp3"
lyric_file = os.getcwd() + "\\lyric.lrc"

video_name = "movie.mp4"


def save():

    lyric_to_images()
    os.system(ffmpeg_path + " -r 2 -i " + images_path + "image%1d.png -i " + audio_file + " -vcodec mpeg4 -y " + video_name)


def lyric_to_images():

    file  = open(lyric_file, "r")

    data = file.readlines()

    startOfLyric = True
    lstTimestamp = []

    images_to_make = 0
    from_second = 0.0
    to_second = 0.0

    for line in data:
        vTime = line[1:9] # 00:00.60

        temp = vTime.split(':')

        minute = float(temp[0])
        #a = float(temp[1].split('.'))
        #second = float((minute * 60) + int(a[0]))
        second = (minute * 60) + float(temp[1])

        lstTimestamp.append(second)

    counter = 1

    for i, second in enumerate(lstTimestamp):

        if startOfLyric is True:
            startOfLyric = False
            #first line is always 3 seconds (images to make = 3x2)
            for x in range(1, 7):
                writeImage(data[i][10:], 'image' + str(counter))
                counter += 1
        else:
            from_second = lstTimestamp[i-1]
            to_second = second

            difference = to_second - from_second
            images_to_make = int(difference * 2)

            for x in range(1, int(images_to_make+1)):
                writeImage(data[i-1][10:], 'image'+str(counter))
                counter += 1

    file.close()

def writeImage(v_text, filename):

    img = Image.new('RGB', (480, 320), color = (73, 109, 137))

    d = ImageDraw.Draw(img)
    d.text((10,10), v_text, fill=(255,255,0))

    img.save(os.getcwd() + "\\test_output\\" + filename + ".png")


save()

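Is there an efficient and accurate way to calculate how many images I need to create for each line?

Note: because I pass -r 2 to FFmpeg (2 FPS), the number of images I create for each line has to be the line's duration in seconds multiplied by 2.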

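2 answers:

Answer 0: (score: 1)

Use subtitles with the subtitles filter. This is easier and more efficient than pre-rendering images and trying to time them. You can also control the font, size, color, style, position, and so on. Example using the color filter as the background: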

ffmpeg -i music.mp3 -filter_complex "color=c=blue,subtitles=lyrics.srt[v]" -map "[v]" -map 0:a -c:a aac -shortest output.mp4

SRT

This is a simple format that supports basic styling.

1
00:00:00,600 --> 00:00:02,340
Revelation, chapter 4

2
00:00:02,340 --> 00:00:04,100
<b>After</b> these <u>things</u> I <font color="green">looked</font>,

3
00:00:04,100 --> 00:00:06,410
and behold a door was opened in heaven,
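
The lyric file in the question is in LRC format, so it has to be converted before the subtitles filter can read it as SRT. The following is a minimal sketch of such a conversion, not part of the original answer. The function names (parse_lrc, srt_time, lrc_to_srt) and the last_duration parameter are made up for illustration. Each cue is assumed to end where the next one starts, and the final line is given an arbitrary display time because LRC does not record when it ends.

```python
import re

# Matches an LRC line such as "[00:02.34]After these things I looked,"
LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")


def parse_lrc(text):
    """Return a list of (start_seconds, lyric) tuples from LRC text."""
    entries = []
    for raw in text.splitlines():
        match = LRC_LINE.match(raw.strip())
        if match:
            minutes, seconds, lyric = match.groups()
            entries.append((int(minutes) * 60 + float(seconds), lyric.strip()))
    return entries


def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def lrc_to_srt(text, last_duration=3.0):
    """Convert LRC text to SRT; each cue ends where the next one starts.

    last_duration is an assumed display time (in seconds) for the final
    line, since LRC does not say when the last lyric ends.
    """
    entries = parse_lrc(text)
    blocks = []
    for index, (start, lyric) in enumerate(entries):
        if index + 1 < len(entries):
            end = entries[index + 1][0]
        else:
            end = start + last_duration
        blocks.append(
            f"{index + 1}\n{srt_time(start)} --> {srt_time(end)}\n{lyric}\n"
        )
    return "\n".join(blocks)
```

Writing the result of lrc_to_srt(open("lyric.lrc").read()) to lyrics.srt should give a file in the shape shown above, without the inline styling tags.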

ASS

ASS subtitles give you much more control, such as styling individual words and letters, but the format is considerably more complex:

[Script Info]
ScriptType: v4.00+
PlayResX: 384
PlayResY: 288

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,0,100,100,0,0,1,1,0,2,10,10,10,0

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.60,0:00:02.34,Default,,0,0,0,,Revelation, chapter 4
Dialogue: 0,0:00:02.34,0:00:04.10,Default,,0,0,0,,After these things I looked,
Dialogue: 0,0:00:04.10,0:00:06.41,Default,,0,0,0,,and behold a door was opened in heaven,

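This example only shows the structure of the format: I did not add any styling. If you want to try this format, you can create ASS subtitles with Aegisub. ffmpeg can also convert between subtitle formats.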
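
If the Dialogue lines are generated from the lyric timestamps rather than written by hand, the main detail is the time format, which uses a single-digit hour and centiseconds. The sketch below is not from the original answer, and the helper names ass_time and ass_dialogue are hypothetical. It emits lines matching the Events Format line above and assumes the start and end times are already known in seconds.

```python
def ass_time(seconds):
    """Format seconds as an ASS timestamp, H:MM:SS.cc (centiseconds)."""
    total_cs = int(round(seconds * 100))
    hours, rest = divmod(total_cs, 360_000)
    minutes, rest = divmod(rest, 6_000)
    secs, cs = divmod(rest, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def ass_dialogue(start, end, text, style="Default"):
    """Build one Dialogue line: layer 0, empty Name and Effect, zero margins."""
    return (
        f"Dialogue: 0,{ass_time(start)},{ass_time(end)},"
        f"{style},,0,0,0,,{text}"
    )
```

For example, ass_dialogue(0.6, 2.34, "Revelation, chapter 4") reproduces the first Dialogue line of the example above.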

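force_style option

The force_style option of the subtitles filter extends the formatting possibilities of the simple SRT format. It takes ASS style fields such as Fontsize, Fontname, OutlineColour, and so on. See the Format line under [V4+ Styles] in the ASS example above for the list of options.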

subtitles=lyrics.srt:force_style='Fontname=DejaVu Serif,PrimaryColour=&HCCFF0000'
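
Since the question drives ffmpeg from Python, the command from this answer can be assembled as an argument list instead of a concatenated string, which avoids shell quoting problems around the filter graph. This is a sketch under assumptions. The names build_ffmpeg_command and run_ffmpeg are hypothetical, the file names are the ones used in the answer, and the subtitle path is assumed to contain no characters that need escaping inside a filter (Windows paths with a drive-letter colon or backslashes would need extra escaping).

```python
import subprocess


def build_ffmpeg_command(audio, subtitles, output, force_style=None,
                         ffmpeg="ffmpeg"):
    """Return the ffmpeg argument list: blue background plus subtitles."""
    sub_filter = f"subtitles={subtitles}"
    if force_style:
        # Single quotes keep the commas in force_style from being read
        # as filter separators.
        sub_filter += f":force_style='{force_style}'"
    filter_graph = f"color=c=blue,{sub_filter}[v]"
    return [
        ffmpeg, "-i", audio,
        "-filter_complex", filter_graph,
        "-map", "[v]", "-map", "0:a",
        "-c:a", "aac", "-shortest", output,
    ]


def run_ffmpeg(command):
    """Run the command and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(command, check=True)
```

Calling run_ffmpeg(build_ffmpeg_command("music.mp3", "lyrics.srt", "output.mp4", force_style="Fontname=DejaVu Serif,PrimaryColour=&HCCFF0000")) would run the command shown earlier with the style override applied.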

Answer 1: (score: 0)

Nice code. The smallest change that would improve it is to compute from_second from the current time position in the file, like this:

if----
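
The snippet above is cut off. One possible reading of the suggestion, which may not be what the answerer wrote, is to derive each line's frame count from absolute positions in the video rather than from per-line differences. Truncating each difference with int() can drop almost a full frame per lyric line, and that loss accumulates over the song. The sketch below is an illustration under assumptions: the name frames_per_line and the last_duration parameter are hypothetical, and the first line is assumed to be shown from the very start of the video, as the question's code does.

```python
def frames_per_line(timestamps, fps=2, last_duration=3.0):
    """Return how many frames each lyric line should be shown for.

    Frame boundaries are computed from absolute timestamps, so rounding
    errors do not accumulate from one line to the next.
    """
    # Assumption: the last line stays on screen for last_duration seconds,
    # because the lyric file does not say when it ends.
    boundaries = list(timestamps) + [timestamps[-1] + last_duration]
    counts = []
    frames_written = 0  # the first line also covers the lead-in from t=0
    for end in boundaries[1:]:
        target = round(end * fps)  # frame index at which this line ends
        counts.append(target - frames_written)
        frames_written = target
    return counts
```

With this approach the total number of frames always equals the rounded end time multiplied by the frame rate, so the lyrics cannot drift away from the audio by more than half a frame.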