Trouble merging 2 subtitle blocks

Asked: 2017-12-17 19:55:32

Tags: python python-3.x merge subtitle

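I'm trying to merge 2 subtitle blocks so they are easier to run through DeepL. The sentences do get merged and the end time changes, but I'm having trouble changing the index numbers. The count variable is incremented but never subtracted from the index.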

For example, if we have these subtitle blocks:

5
00:00:23,315 --> 00:00:25,108
A streetwise but soulful
teen needed somewhere to live

6
00:00:25,192 --> 00:00:26,610
as he waited for his Juilliard audition.

7
00:00:26,693 --> 00:00:29,488
We'd support his dancing and let
him stay in the guest room, right.

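Blocks 5 and 6 get merged, and the end time becomes that of block 6. That part works fine, except that after the merge I should end up with indexes 5 and 6, but I get 5 and 7.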

An example of what I want to produce:

5
00:00:23,315 --> 00:00:26,610
A streetwise but soulful
teen needed somewhere to live
as he waited for his Juilliard audition.

6
00:00:26,693 --> 00:00:29,488
We'd support his dancing and let
him stay in the guest room, right.

Here is my code. I tried adding the subtraction in 2 places, and also tried subs[sub.index].index = subs[sub.index] - count, but neither worked.

import pysrt
import os

count = 0

# Init pysrt
subs = pysrt.open(" Bojack Horseman36.srt")
# Go through each subtitle
for sub in subs:
    try:
        # Check if it's a sentence if not check if there is another sentence there if not nothing just remove index
        sentence = None
        if subs[sub.index].text.endswith('.') or subs[sub.index].text.endswith('?') or subs[sub.index].text.endswith('!'):
            subs[sub.index].index - count
        else:
            subs[sub.index].text = subs[sub.index].text + '\n' + subs[sub.index+1].text
            count+=1
            subs[sub.index].index - count
            subs[sub.index].end = subs[sub.index+1].end
            del subs[sub.index+1]
    except IndexError:      
        pass

subs.save('translatedsubs.srt', encoding='utf-8')

Thanks for any help :D

1 Answer:

Answer 0: (score: 2)

The following should help you get started. It merges blocks 5 and 6 of your sample and renumbers the blocks that follow, but it may run into problems if several merges are needed in a row:

import pysrt

subs = pysrt.open("test.srt")
append_index = None
remove_list = []                # List of unwanted indexes
sub_index = subs[0].index       # Existing starting index

for index, sub in enumerate(subs):
    if append_index is not None:
        subs[append_index].text += "\n" + sub.text
        subs[append_index].end = sub.end
        remove_list.append(index)
    if sub.text[-1] not in '.?!':
        append_index = index
    else:
        append_index = None

# Remove orphaned subs in reverse order        
for index in remove_list[::-1]:     
    del subs[index]

# Reindex remaining subs
for index in range(len(subs)):
    subs[index].index = index + sub_index

subs.save('test out.srt', encoding='utf-8')
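
The loop above attaches each continuation to the block immediately before it, so a run of three or more unfinished blocks can lose text: the third block is appended to the second, which is then deleted. Below is a minimal sketch of one way around that, assuming it is enough to remember the block that opened the current run. `Block` is a hypothetical stand-in for a subtitle item (only the `index`, `end` and `text` attributes used above are assumed), and `merge_runs` is an illustrative name, not part of pysrt.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a subtitle item (index, end time, text)."""
    index: int
    end: str
    text: str


def merge_runs(blocks, terminators=".?!"):
    """Merge each unfinished block with the blocks that follow it.

    Returns a new list. The input list is not resized, but the blocks
    that are kept have their text, end and index updated in place.
    """
    if not blocks:
        return []
    first_index = blocks[0].index   # Keep the existing starting index
    merged = []
    open_block = None               # Block still waiting for its sentence end

    for block in blocks:
        if open_block is None:
            # Start a new run with this block
            merged.append(block)
            open_block = block
        else:
            # Continue the current run: extend text and end time
            open_block.text += "\n" + block.text
            open_block.end = block.end
        # Close the run once the accumulated text ends a sentence
        if open_block.text.rstrip().endswith(tuple(terminators)):
            open_block = None

    # Renumber consecutively from the original starting index
    for offset, block in enumerate(merged):
        block.index = first_index + offset
    return merged


sample = [
    Block(5, "00:00:25,108", "A streetwise but soulful\nteen needed somewhere to live"),
    Block(6, "00:00:26,610", "as he waited for his Juilliard audition."),
    Block(7, "00:00:29,488", "We'd support his dancing and let\nhim stay in the guest room, right."),
]
for block in merge_runs(sample):
    print(block.index, block.end)
# 5 00:00:26,610
# 6 00:00:29,488
```

Because the sketch builds a new list instead of deleting from the one being iterated, no index bookkeeping is needed. It has not been run against pysrt itself, so treat the attribute names as the only assumption carried over from the code above.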

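Note that it is best not to delete or add items in a list you are iterating over. Instead, I build a list of indexes to remove and then delete the unwanted entries in reverse order, so that the indexes of the items still waiting to be removed stay unchanged.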
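
To see why the order matters, here is a small illustration on a plain list (the values are made up for the example). Deleting from the front shifts every later item down by one, so the remaining indexes point at the wrong items; deleting from the back leaves the earlier positions untouched.

```python
items = ["a", "b", "c", "d", "e"]
remove_list = [1, 3]            # We want to drop "b" and "d"

# Forward order: after deleting index 1, "d" has moved to index 2,
# so deleting index 3 removes "e" instead.
forward = list(items)
for index in remove_list:
    del forward[index]
print(forward)                  # ['a', 'c', 'd']

# Reverse order: deleting index 3 first does not disturb index 1.
backward = list(items)
for index in remove_list[::-1]:
    del backward[index]
print(backward)                 # ['a', 'c', 'e']
```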