Find out whether two consecutive lines differ, and where

Time: 2016-06-20 15:47:53

Tags: bash awk sed echo

How can I find out whether two consecutive lines of a fixed-width file differ, and at which positions?

Sample file:

cat test.txt
1111111111111111122211111111111111
1111111111111111132211111111111111

Output:

It should tell the user that the two lines differ and that the difference is at the 18th character (as in the example above).

It would be very useful if it could list every position when there are multiple changes. For example:

11111111111111111211113111
11111111111111111311114111

It should say: differences at the 18th and 23rd characters.

I am trying something along the following lines, but I seem to be lost.

while read line
do
    echo "$line" | sed 's/./ &/g' | xargs -n1   # Not able to apply diff (stupid try)
done <test.txt

3 answers:

Answer 0 (score: 2)

Perl to the rescue:

$ echo '11131111111111111211113111
11111111111111111211114111' \
| perl -le '$d = <> ^ <>;
             print pos $d while $d =~ /[^\0]/g'
4
23

It XORs the two input strings and reports every position where the result is not a null byte, i.e. every position where the strings differ.
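
The same XOR idea can be sketched without Perl. What follows is a minimal POSIX shell sketch, not the answerer's code: `xor_positions` is a hypothetical helper name, it assumes `od` is available, and it counts positions in bytes rather than characters.

```shell
# xor_positions is a hypothetical helper, introduced here for illustration.
# It prints each 1-based byte position where the two argument strings differ.
xor_positions() {
    # Turn each string into a list of decimal byte values.
    bytes_a=$(printf '%s' "$1" | od -An -v -tu1)
    bytes_b=$(printf '%s' "$2" | od -An -v -tu1)

    # Load the second list into the positional parameters so that both
    # lists can be walked in step.
    set -- $bytes_b
    pos=0
    for x in $bytes_a; do
        pos=$((pos + 1))
        y=0                                  # a missing byte is treated as NUL
        if [ "$#" -gt 0 ]; then y=$1; shift; fi
        # The XOR of two bytes is zero exactly when the bytes are equal.
        if [ $((x ^ y)) -ne 0 ]; then echo "$pos"; fi
    done
    # Bytes left over in a longer second string also count as differences.
    while [ "$#" -gt 0 ]; do
        pos=$((pos + 1))
        echo "$pos"
        shift
    done
    return 0
}
```

Called as `xor_positions 11131111111111111211113111 11111111111111111211114111`, it should print 4 and 23, matching the Perl output above.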

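Answer 1 (score: 1)

You can use an empty field separator to make each character a field in awk, and then compare the fields of each even-numbered record with those of the odd-numbered record before it: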

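awk 'BEGIN{ FS="" } NR%2 {
  split($0, a)    # odd line: store its characters in a[1..n]
  next
}
{
   # even line: compare each character with the stored odd line
   print "line # ", NR
   for (i=1; i<=NF; i++)
      if ($i != a[i])
         print "difference spotted in position:", i
}' test.txt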

line #  2
difference spotted in position: 18
line #  4
difference spotted in position: 18
difference spotted in position: 23

The input data was:

cat test.txt

1111111111111111122211111111111111
1111111111111111132211111111111111
11111111111111111211113111
11111111111111111311114111

PS: This only works in awk versions that split a record into individual characters when FS is empty, e.g. GNU awk, OS X awk, etc.
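
Because an empty FS is not portable across awk implementations, the same pairwise comparison can be sketched with `substr()` instead. This is only an illustrative variant under stated assumptions: `diff_pairs` is a hypothetical wrapper name, and the one-line-per-pair output format is chosen here to resemble what the question asked for.

```shell
# diff_pairs is a hypothetical wrapper, introduced here for illustration.
# It reads lines in pairs (1-2, 3-4, ...) from the given files or stdin
# and prints one summary line per pair.
diff_pairs() {
    awk '
    NR % 2 { prev = $0; next }              # odd line: remember it
    {
        # even line: walk to the end of the longer of the two lines
        n = (length($0) > length(prev)) ? length($0) : length(prev)
        out = ""; sep = ""
        for (i = 1; i <= n; i++)
            if (substr($0, i, 1) != substr(prev, i, 1)) {
                out = out sep i             # collect differing positions
                sep = " "
            }
        if (out == "")
            printf "lines %d-%d: identical\n", NR - 1, NR
        else
            printf "lines %d-%d: differ at %s\n", NR - 1, NR, out
    }' "$@"
}
```

Called as `diff_pairs test.txt` on the input shown above, it should report position 18 for lines 1-2 and positions 18 and 23 for lines 3-4, in line with the output of the original script.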

Answer 2 (score: 1)

$ cat tst.awk
{ curr = $0 }
(NR%2)==0 {    # every even line: compare it with the line before it
    currLgth = length(curr)
    prevLgth = length(prev)
    # walk to the end of the longer line so trailing extra characters are reported
    maxLgth = (currLgth > prevLgth ? currLgth : prevLgth)
    print "Comparing:"
    print prev
    print curr
    for (i=1; i<=maxLgth; i++) {
        prevChar = substr(prev,i,1)
        currChar = substr(curr,i,1)
        if ( prevChar != currChar ) {
            printf "Difference: char %d line %d = \"%s\", line %d = \"%s\"\n", i, NR-1, prevChar, NR, currChar
        }
    }
    print ""
}
{ prev = curr }    # remember this line for the next comparison

$ cat file
1111111111111111122211111111111111
1111111111111111132211111111111111
11111111111111111111111111
11111111111111111111111

$ awk -f tst.awk file
Comparing:
1111111111111111122211111111111111
1111111111111111132211111111111111
Difference: char 18 line 1 = "2", line 2 = "3"

Comparing:
11111111111111111111111111
11111111111111111111111
Difference: char 24 line 3 = "1", line 4 = ""
Difference: char 25 line 3 = "1", line 4 = ""
Difference: char 26 line 3 = "1", line 4 = ""