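I need to find the timestamp difference between two related lines in a file that do not share a common identifier. For example:
Since the validation processes have no unique identifier, I need to:
I already have a function that calculates the time difference. What I am struggling with is parsing and correlating each pair of "Validating" / "Done validating" lines. There will be hundreds of these lines, but they always appear in order and are processed serially, so when I find a "Validating" line I know that the next "Done validating" line (however far away it is) corresponds to it.
I thought I could extract the two kinds of lines separately (all "Validating" lines into file A, all "Done validating" lines into file B) and then correlate them line by line. Is that the best approach, or is there a way to do this without generating extra files?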
Answer 0 (score: 3)
Using bash:
#!/bin/bash
while read -r a b c d e; do
[[ "$c $d" =~ Validating\ control... ]] && echo "$a $b"
[[ "$c $d $e" =~ Done\ validating\ control. ]] && echo "$a $b"
done < file
or
#!/bin/bash
while read -r a b c d e; do
[[ "$c $d" =~ Validating\ control... ]] && start="$a $b"
if [[ "$c $d $e" =~ Done\ validating\ control. ]]; then
stop="$a $b"
echo "$start"
echo "$stop"
fi
done < file
Output:
2018-01-29 15:05:11,592 2018-01-29 15:05:10,725
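Both scripts above stop at printing the two timestamps. A minimal sketch of the remaining step, pairing each start with the next end and printing the difference in milliseconds, might look like this. It assumes GNU date (for `-d`), timestamps of the form `YYYY-MM-DD HH:MM:SS,mmm` with exactly three digits after the comma, and lines whose text after the timestamp begins with "Validating control" or "Done validating control"; the function names `to_ms` and `pair_deltas` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: "YYYY-MM-DD HH:MM:SS,mmm" -> milliseconds since the epoch.
# Assumes GNU date; -u keeps DST changes from distorting the difference.
to_ms() {
    local stamp=$1 secs frac
    secs=$(date -u -d "${stamp%,*}" +%s) || return 1
    frac=${stamp##*,}
    # 10# forces base 10 so a fraction such as "058" is not read as octal
    echo $(( secs * 1000 + 10#$frac ))
}

# Hypothetical helper: read the log on stdin, print one delta (ms) per pair.
pair_deltas() {
    local d t rest start=""
    while read -r d t rest; do
        case $rest in
            "Validating control"*)
                start="$d $t" ;;
            "Done validating control"*)
                [ -n "$start" ] || continue   # ignore an end without a start
                echo $(( $(to_ms "$d $t") - $(to_ms "$start") ))
                start="" ;;
        esac
    done
}
```

It would be called as `pair_deltas < file`. No intermediate files are needed because only the most recent start timestamp is kept in a variable.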
Answer 1 (score: 3)
awk to the rescue!
Create timestamp pairs from the matching lines:
$ awk 'BEGIN {FS=OFS=","}
/Validating control/ {s=$1}
/Done validating control/{print s,$1}' file
2018-01-29 15:05:11,2018-01-29 15:05:10
It may make sense to do the time-delta calculation in awk as well (mktime is a GNU awk extension):
$ awk 'BEGIN {FS=OFS=","}
/Validating control/ {s=$1}
/Done validating control/{gsub(/[:-]/," ",s);
gsub(/[:-]/," ",$1);
print mktime($1)-mktime(s)}' file
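However, your data is in reverse time order (it ends about one second before it starts), so the result will be a negative number of seconds.
If the digits after the seconds are part of the timestamp, then this may work better: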
$ awk -F'[, ]' '/Validating control/{s=$1":"$2;ms=$3}
/Done validating control/{t=$1":"$2;
print s ms,t $3;
gsub(/[:-]/," ",s);
gsub(/[:-]/," ",t);
print (mktime(t)+($3/1000))-(mktime(s)+(ms/1000))}' file
2018-01-29:15:05:11592 2018-01-29:15:05:10725
-0.867
Answer 2 (score: 2)
Here is one in GNU awk that also calculates the time difference. The example run uses the same data twice:
$ awk '
BEGIN { FS="[- :,]" } # set FS to get the timestamp parts
/alidating/ { # if matched
if(a!="") { # read the latter value and convert to epoch time:
b=mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)+($7/10^length($7))
print b-a # calculate time difference
a=b="" # reset vars for the next pair
next # skip to next record
} # below the former of two values is processed:
a=mktime($1 " " $2 " " $3 " " $4 " " $5 " " $6)+($7/(10^length($7)))
}' file file # use same test data twice
0.867
0.867
The +($7/10^length($7)) term handles the fractional part: for example, 0,592 is converted to 592/10^3 = 592/1000 = 0.592, 0,1 is converted to 1/10 = 0.1, and so on.
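To see how that scaling behaves for fractions of different lengths, here is a small standalone check. The function name `frac_to_seconds` is hypothetical and the inputs are just the two cases mentioned above; it uses only standard awk features, so it does not need GNU awk.

```shell
#!/bin/bash
# Hypothetical helper: turn the digits after the comma into a fraction of a second
# by dividing them by 10 raised to the number of digits.
frac_to_seconds() {
    awk -v digits="$1" 'BEGIN { print digits / 10 ^ length(digits) }'
}

frac_to_seconds 592   # prints 0.592
frac_to_seconds 1     # prints 0.1
```

Dividing by a power of ten based on the digit count, rather than always by 1000, is what lets the same expression cope with logs that record tenths, hundredths, or milliseconds.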
Answer 3 (score: 1)
The following script shows how you can save the timestamps into arrays in bash.
$ cat test.sh
#!/bin/bash
# Use sed to print only the relevant lines.
# This also reduces the number of lines to be processed by while loop
sed -n '/Validating control.../,/Done validating control/{//p}' inputFile.txt > /tmp/input_sedVersion.txt
declare -a arr1=()
declare -a arr2=()
i=0
while read -r _date _time _state
do
if [[ "$_state" =~ Validating ]]; then
arr1[$i]="$_date $_time";
else
arr2[$i]="$_date $_time";
((i++));
fi
done < /tmp/input_sedVersion.txt
echo "arr1: ${arr1[@]}"
echo "arr2: ${arr2[@]}"
# Code do something with these arrays
Output:
$ ./test.sh
arr1: 2018-01-29 15:05:11,592 2018-01-29 15:10:11,592 2018-01-29 15:15:11,592
arr2: 2018-01-29 15:05:10,725 2018-01-29 15:10:10,725 2018-01-29 15:15:11,725
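The "do something with these arrays" step is left open in the script. One possible continuation, as a sketch only: walk both arrays by index and print the difference in milliseconds. It assumes GNU date, three digits after the comma, and that `arr1` and `arr2` are filled as in the script above; `stamp_to_ms` and `print_deltas` are hypothetical names.

```shell
#!/bin/bash
# Hypothetical helper: "YYYY-MM-DD HH:MM:SS,mmm" -> milliseconds since the epoch.
# Assumes GNU date; 10# keeps a fraction like "058" from being parsed as octal.
stamp_to_ms() {
    local secs
    secs=$(date -u -d "${1%,*}" +%s) || return 1
    echo $(( secs * 1000 + 10#${1##*,} ))
}

# Print "index: delta ms" for every index present in both arr1 (starts) and arr2 (ends).
print_deltas() {
    local i
    for i in "${!arr1[@]}"; do
        [ -n "${arr2[$i]:-}" ] || continue   # start without a matching end
        echo "$i: $(( $(stamp_to_ms "${arr2[$i]}") - $(stamp_to_ms "${arr1[$i]}") )) ms"
    done
}
```

With the arrays shown in the output above, calling `print_deltas` after the echo lines would report -867 ms for the first two pairs and 133 ms for the third.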