I have a fun project to work on! I'm thinking about converting an srt file into a csv/xls file.
An srt file looks like this:
1
00:00:00,104 --> 00:00:02,669
Hi, I'm shell-scripting.

2
00:00:02,982 --> 00:00:04,965
I'm not sure if it would work,
but I'll try it!

3
00:00:05,085 --> 00:00:07,321
There must be a way to do it!
and I want to output it to a csv file like this:
"1","00:00:00,104","00:00:02,669","Hi, I'm shell-scripting."
"2","00:00:02,982","00:00:04,965","I'm not sure if it would work"
,,,"but I'll try it!"
"3","00:00:05,085","00:00:07,321","There must be a way to do it!"
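As you can see, a subtitle may span two lines. My idea is to use grep to get the srt data into an xls file, and then use awk to format the xls file.
What do you think? How should I write it? I tried
$grep filename.srt > filename.xls
All the data, timecodes and subtitle text alike, seems to end up in column A of the xls file... but I want the text in column B... How can awk help with the formatting?
Thanks in advance! :)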
Answer 0 (score: 4)
$ cat tst.awk
BEGIN { RS=""; FS="\n"; OFS=","; q="\""; s=q OFS q }
{
split($2,a,/ .* /)
print q $1 s a[1] s a[2] s $3 q
for (i=4;i<=NF;i++) {
print "", "", "", q $i q
}
}
$ awk -f tst.awk file
"1","00:00:00,104","00:00:02,669","Hi, I'm shell-scripting."
"2","00:00:02,982","00:00:04,965","I'm not sure if it would work,"
,,,"but I'll try it!"
"3","00:00:05,085","00:00:07,321","There must be a way to do it!"
Answer 1 (score: 1)
I think something like this should do nicely:
awk -v RS= -F'\n' '
{
sub(" --> ","\x7c",$2) # change "-->" to "|"
printf "%s|%s|%s\n",$1,$2,$3 # print scene, time start, time stop, description
for(i=4;i<=NF;i++)printf "|||%s\n",$i # print remaining lines of description
}' file.srt
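-v RS= sets the record separator to blank lines, so each subtitle block is read as one record. -F'\n' sets the field separator to newline, so each line of the block is one field.
sub() replaces " --> " with a pipe symbol (|).
The first three fields are then printed separated by pipes, and a small loop prints any remaining description lines, each prefixed with three pipe symbols so that they line up in the fourth column.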
Output
1|00:00:00,104|00:00:02,669|Hi, I'm shell-scripting.
2|00:00:02,982|00:00:04,965|I'm not sure if it would work,
|||but I'll try it!
3|00:00:05,085|00:00:07,321|There must be a way to do it!
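Paragraph mode (`RS=`) splits records only on lines that are completely empty. SRT files saved on Windows often end each line with CR LF, in which case the "blank" lines still contain a carriage return and the whole file may be read as a single record. A hedged sketch, assuming that is the situation: strip the carriage returns with `tr` first, then run the same awk program. The function name `srt_to_pipes` is hypothetical.

```shell
# srt_to_pipes: hypothetical wrapper; reads SRT (LF or CRLF) on stdin.
srt_to_pipes() {
  tr -d '\r' | awk -v RS= -F'\n' '
    {
      sub(/ --> /, "|", $2)                          # start|end
      printf "%s|%s|%s\n", $1, $2, $3                # number, times, first text line
      for (i = 4; i <= NF; i++) printf "|||%s\n", $i # remaining text lines
    }
  '
}
```

Usage would be `srt_to_pipes < file.srt`. For files that already use plain LF line endings the `tr` step changes nothing.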
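Since I felt like having some more fun with Perl and Excel, I took the output above, parsed it in Perl, and wrote a real Excel XLSX file. Of course, there is no need to use both awk and Perl, so ideally I would rework the awk and fold it into the Perl, since the latter can write Excel files and the former cannot. Anyway, here is the Perl.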
#!/usr/bin/perl
use strict;
use warnings;
use Excel::Writer::XLSX;
my $DEBUG=0;
my $workbook = Excel::Writer::XLSX->new('result.xlsx');
my $worksheet = $workbook->add_worksheet();
my $row=0;
while(my $line=<>){
$row++; # move down a line in Excel worksheet
chomp $line; # strip trailing newline
my @f=split /\|/, $line; # split fields of line into array @f[], on pipe symbols (|)
for(my $j=0;$j<scalar @f;$j++){ # loop through all fields
my $cell= chr(65+$j) . $row; # calculate Excel cell, starting at A1 (65="A")
$worksheet->write($cell,$f[$j]); # write to spreadsheet
printf "%s:%s ",$cell,$f[$j] if $DEBUG;
}
printf "\n" if $DEBUG;
}
$workbook->close;
Answer 2 (score: 1)
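My other answer is half awk and half Perl. But given that awk cannot write Excel spreadsheets and Perl can, it seems daft to require you to master both awk and Perl when Perl is perfectly capable of doing everything on its own. So here it is in Perl alone:
#!/usr/bin/perl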
use strict;
use warnings;
use Excel::Writer::XLSX;
my $workbook = Excel::Writer::XLSX->new('result.xlsx');
my $worksheet = $workbook->add_worksheet();
my $ExcelRow=0;
local $/ = ""; # set paragraph mode, so we read till next blank line as one record
while(my $para=<>){
$ExcelRow++; # move down a line in Excel worksheet
chomp $para; # strip trailing newlines
my @lines=split /\n/, $para; # split paragraph into lines on linefeed character
my $scene = $lines[0]; # pick up scene number from first line of para
my ($start,$end)=split / --> /,$lines[1]; # pick up start and end time from second line
my $cell=sprintf("A%d",$ExcelRow); # work out cell
$worksheet->write($cell,$scene); # write scene to spreadsheet column A
$cell=sprintf("B%d",$ExcelRow); # work out cell
$worksheet->write($cell,$start); # write start time to spreadsheet column B
$cell=sprintf("C%d",$ExcelRow); # work out cell
$worksheet->write($cell,$end); # write end time to spreadsheet column C
$cell=sprintf("D%d",$ExcelRow); # work out cell
$worksheet->write($cell,$lines[2]); # write description to spreadsheet column D
for(my $i=3;$i<scalar @lines;$i++){ # output additional lines of description
$ExcelRow++;
$cell=sprintf("D%d",$ExcelRow); # work out cell
$worksheet->write($cell,$lines[$i]);
}
}
$workbook->close;
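Save the above in a file called srt2xls, then make it executable with:
chmod +x srt2xls
Then you can run it with:
./srt2xls < SomeFileile.srt
It will give you a spreadsheet named result.xlsx, the filename set in the script.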
Answer 3 (score: 0)
Since you want to convert srt to csv, here is an awk command:
awk '{gsub(" --> ","\x22,\x22");if(NF!=0){if(j<3)k=k"\x22"$0"\x22,";else{k="\x22"$0"\x22 ";l=1}j=j+1}else j=0;if(j==3){print k;k=""}if(l==1){print ",,,"k ;l=0;k=""}}' inputfile > output.csv
The same awk, expanded for readability:
awk '{
gsub(" --> ","\x22,\x22");
if(NF!=0)
{
if(j<3)
k=k"\x22"$0"\x22,";
else
{
k="\x22"$0"\x22 ";
l=1
}
j=j+1
}
else
j=0;
if(j==3)
{
print k;
k=""
}
if(l==1)
{
print ",,,"k;
l=0;
k=""
}
}' inputfile > output.csv
Copy output.csv to a Windows machine, open it in Microsoft Excel, and save it with the .xls extension.
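The awk command in this answer works line by line with a counter rather than in paragraph mode, and as written it leaves a trailing comma on each main row. A sketch of the same counter idea that builds the row up and prints it without trailing separators follows. The function name `srt_lines_to_csv` is hypothetical, and it assumes cues are separated by blank lines.

```shell
# srt_lines_to_csv: hypothetical wrapper; reads SRT on stdin, writes CSV on stdout.
srt_lines_to_csv() {
  awk '
    BEGIN { q = "\""; n = 0 }
    NF == 0 { n = 0; next }            # blank line: the next cue starts
    {
      n++                              # position of this line within the cue
      if (n == 1) {
        row = q $0 q                   # cue number
      } else if (n == 2) {
        split($0, t, / --> /)          # start and end time
        row = row "," q t[1] q "," q t[2] q
      } else if (n == 3) {
        print row "," q $0 q           # first text line completes the row
      } else {
        print ",,," q $0 q             # extra text lines go in column 4
      }
    }
  '
}
```

For example: `srt_lines_to_csv < inputfile > output.csv`.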