Using AWK to find a string before or after a given iteration of a repeating sequence

Date: 2017-11-03 22:51:49

Tags: awk ffmpeg

I have a text document that looks like this (truncated):

[FRAME]
pkt_pts_time=0.000000
pict_type=I
[/FRAME]
[FRAME]
pkt_pts_time=0.250250
pict_type=B
[/FRAME]
[FRAME]
pkt_pts_time=0.500500
pict_type=P
[/FRAME]
[FRAME]
pkt_pts_time=0.750750
pict_type=B
[/FRAME]
[FRAME]
pkt_pts_time=0.959292
pict_type=I
[/FRAME]

This text was created with the following command:

ffprobe -select_streams v -show_frames -show_entries frame=pkt_pts_time,pict_type,frame_number -v quiet input.mp4

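As you can see, the [FRAME] to [/FRAME] sequence repeats. This is how I count frames and find which frames are I-frames. The "pict_type=" value changes from one sequence to the next. I would like to know whether there is a way, using AWK, to input an iteration number and output the pkt_pts_time value of the nearest preceding sequence whose pict_type equals I.

For example, suppose my frame number is 3. I would input the number 3, and the awk expression would go to the third [FRAME] to [/FRAME] sequence and then look backwards from there until it finds a "pict_type=I" string. It would then see that the pkt_pts_time for that iteration of the sequence is "pkt_pts_time=0.000000" and output 0.000000.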

3 answers:

Answer 0 (score: 1)

Try this. I can explain how it works if you like. I count frames by the closing marker [/FRAME], but this can be changed to the opening marker [FRAME].

awk -F '=' -v frame_number=3 '
$1 == "[/FRAME]" {
    frame_cnt++;    
}
$1 == "pkt_pts_time" {
    tmp_time = $2;
}
$2 == "I" {
    i_time = tmp_time;
}
frame_cnt == frame_number {
    print i_time;
    exit;
}' input.txt
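
The answer above notes that the counter could key on the opening [FRAME] marker instead. Here is a minimal sketch of that variant, under some assumptions. Counting moves to [FRAME], but printing is deferred to the matching [/FRAME], so an I-frame in the target frame itself is still taken into account. The helper name last_i_time, the temporary file, and the printf-generated sample (which mirrors the data in the question) are illustrative and not part of the original answer.

```shell
# Recreate the sample from the question; printf reuses the format for each pair.
sample=$(mktemp)
printf '[FRAME]\npkt_pts_time=%s\npict_type=%s\n[/FRAME]\n' \
    0.000000 I 0.250250 B 0.500500 P 0.750750 B 0.959292 I > "$sample"

# last_i_time N FILE (hypothetical helper): print the time of the nearest
# I-frame at or before frame N.
last_i_time() {
    awk -F '=' -v frame_number="$1" '
    $1 == "[FRAME]"                { frame_cnt++ }        # count at the opening tag
    $1 == "pkt_pts_time"           { tmp_time = $2 }      # time of the frame being read
    $1 == "pict_type" && $2 == "I" { i_time = tmp_time }  # latest I-frame time so far
    # Print only once the target frame has been read completely.
    $1 == "[/FRAME]" && frame_cnt == frame_number { print i_time; exit }
    ' "$2"
}

r3=$(last_i_time 3 "$sample")
r5=$(last_i_time 5 "$sample")
echo "frame 3 -> $r3"
echo "frame 5 -> $r5"
rm -f "$sample"
```

If the check were left on the [FRAME] line itself, the script would print before reading the target frame's own pict_type, so it would only see I-frames strictly before the target.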

A version that also prints the number of the frame following the I-frame:

awk -F '=' -v frame_number=3 '
$1 == "[/FRAME]" {
    frame_cnt++;    
}
$1 == "pkt_pts_time" {
    tmp_time = $2;
}
$2 == "I" {
    i_time = tmp_time;
    i_frame_number = frame_cnt + 1;
}
frame_cnt == frame_number {
    print "The I frame time = " i_time;
    print "The I frame number + 1 = " i_frame_number + 1;
    exit;
}' input.txt

This version prints the times of the nearest lower and upper I-frames around the target frame:

awk -F '=' -v frame_number=3 '
# Frame counter: each time the first field of the line equals
# the string [FRAME], the counter is incremented.

$1 == "[FRAME]" {
    frame_cnt++;    
}
# The tmp_time variable is overwritten every time a pkt_pts_time line occurs,
# so it always holds the time of the frame currently being read.

$1 == "pkt_pts_time" {
    tmp_time = $2;
}
# Find the nearest I-frame at or before the target frame.
# Each time an I-frame occurs, i_lower is updated, as long as the
# target frame has not been passed. Its last update therefore comes
# from the nearest I-frame at or before the target frame.

frame_cnt <= frame_number && $2 == "I" {
    i_lower = tmp_time;
}
# Find the nearest I-frame at or after the target frame.
# When it occurs, the lower and upper I-frame times are printed
# and the script stops.
# Note that if no such upper I-frame exists, the script prints nothing,
# because this condition is never true.

frame_cnt >= frame_number && $2 == "I" {
    print "lower I = " i_lower;
    print "upper I = " tmp_time;
    exit;
}' input.txt

Answer 1 (score: 1)

Another approach, using gawk and its record structure:

$ awk -v RS='\\[/FRAME\\]' '/pict_type=I/{for(i=1;i<=NF;i++)
                                             if($i~/pkt_pts_time/)
                                               {time=$i; break}};
                             NR==3 {split(time,t,"="); print t[2]; exit}' file

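It stores the time whenever a record has the given picture type, and on reaching the third record it prints the most recently stored time.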
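
As a hedged sketch of the same record-based idea, the frame number can be passed in as a variable rather than hard-coded. The variable name n, the temporary file, and the printf-generated sample are assumptions made for illustration. A multi-character regular expression in RS is a gawk extension (POSIX leaves the behaviour unspecified), so this assumes gawk or another awk that supports it.

```shell
# Recreate the sample from the question in a temporary file.
sample=$(mktemp)
printf '[FRAME]\npkt_pts_time=%s\npict_type=%s\n[/FRAME]\n' \
    0.000000 I 0.250250 B 0.500500 P 0.750750 B 0.959292 I > "$sample"

# i_time_by_record N FILE (hypothetical helper): each [/FRAME] ends a record,
# so NR is the frame number and the default FS splits the lines into fields.
i_time_by_record() {
    awk -v RS='\\[/FRAME\\]' -v n="$1" '
    /pict_type=I/ {
        # Remember the pkt_pts_time field of the latest I-frame record.
        for (i = 1; i <= NF; i++)
            if ($i ~ /^pkt_pts_time=/) { time = $i; break }
    }
    NR == n { split(time, t, "="); print t[2]; exit }
    ' "$2"
}

rec3=$(i_time_by_record 3 "$sample")
rec5=$(i_time_by_record 5 "$sample")
echo "frame 3 -> $rec3"
echo "frame 5 -> $rec5"
rm -f "$sample"
```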

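Answer 2 (score: 0)

It sounds like this is what you are asking for, but if you want something related to frame 3, it will not produce any output from your sample input, because nothing in your sample input meets the requirement as I understand it: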

$ cat tst.awk
BEGIN { FS="=" }
$1=="[FRAME]" { ++frameNr }
{ frame[$1] = $2 }
$1=="[/FRAME]" {
    if ( frameNr == n ) {
        if ( frame["pict_type"] == "I" ) {
            print frame["pkt_pts_time"]
        }
    }
    delete frame
}

$ awk -v n=3 -f tst.awk file

$ awk -v n=5 -f tst.awk file
0.959292

In any case, hopefully it is clear what the script is doing, and you can massage it to suit if it is not quite what you need.
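
One possible way to massage it, sketched here as an assumption about what the question wants rather than as this answer's own method: remember the time of the most recent I-frame as each frame closes, and print that remembered time when frame n is reached. The variable lastI, the helper name, the temporary file, and the printf-generated sample are illustrative.

```shell
# Recreate the sample from the question in a temporary file.
sample=$(mktemp)
printf '[FRAME]\npkt_pts_time=%s\npict_type=%s\n[/FRAME]\n' \
    0.000000 I 0.250250 B 0.500500 P 0.750750 B 0.959292 I > "$sample"

# prev_i_time N FILE (hypothetical helper): variant of tst.awk that looks
# back to the nearest I-frame at or before frame N.
prev_i_time() {
    awk -v n="$1" '
    BEGIN { FS = "=" }
    $1 == "[FRAME]" { ++frameNr }
    { frame[$1] = $2 }                      # collect the fields of the current frame
    $1 == "[/FRAME]" {
        if (frame["pict_type"] == "I")
            lastI = frame["pkt_pts_time"]   # most recent I-frame time so far
        if (frameNr == n) { print lastI; exit }
        delete frame
    }' "$2"
}

p3=$(prev_i_time 3 "$sample")
p5=$(prev_i_time 5 "$sample")
echo "frame 3 -> $p3"
echo "frame 5 -> $p5"
rm -f "$sample"
```

With the sample input this prints 0.000000 for frame 3, where the original tst.awk prints nothing because frame 3 is not itself an I-frame.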