How do I update a flat file in bash (Linux) using data from the same flat file?

Asked: 2013-07-24 14:52:49

Tags: linux bash scripting flat-file

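I have a flat file delimited by | that I want to update using information already present in the same file. I want to fill in the third field using the first and second fields. For the first field, I want to ignore the last two digits when comparing against rows that are missing the third field. The second field must match exactly. I don't want to create a new flat file; I want to update the existing one. I have looked into a way to pull the first two fields out of the file, but I don't know whether that helps with what I am trying to achieve. To sum up: I want to compare the first and second fields against the other rows in the file, in order to fill in a third field that may be missing on some lines.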

awk -F'|' -v OFS='|' '{sub(/[0-9 ]+$/,"",$1)}1 {print $1 "\t" $2}' tstfile

first field|second field|third field

Original input:

t1ttt01|/a1

t1ttt01|/b1

t1ttt01|/c1

t1ttt03|/a1|1

t1ttt03|/b1|1

t1ttt03|/c1|1

l1ttt03|/a1|3

l1ttt03|/b1|3

l1ttt03|/c1|3

What it should do:

t1ttt03|/a1|1 = t1ttt01|/a1

Compare t1ttt|/a1| = t1ttt|/a1

Therefore

t1ttt01|/a1 becomes t1ttt01|/a1|1
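The comparison described above can be sketched as a key built from the first field minus its last two digits, joined to the second field unchanged. This is only an illustration: the three sample rows are taken from the input above, and the variable names `prefix` and `keys` are made up for the example.

```shell
# Print the comparison key next to each sample row.
keys=$(printf '%s\n' 't1ttt01|/a1' 't1ttt03|/a1|1' 'l1ttt03|/a1|3' |
awk -F'|' '{
    prefix = $1
    sub(/[0-9][0-9]$/, "", prefix)   # drop exactly the last two digits
    print prefix "|" $2 " <- " $0    # key, then the row it came from
}')
printf '%s\n' "$keys"
```

The first two rows produce the same key (t1ttt|/a1), so the third field of one can be copied to the other; the l1ttt row produces a different key and is left alone.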

What I want the output to look like:

t1ttt01|/a1|1

t1ttt01|/b1|1

t1ttt01|/c1|1

t1ttt03|/a1|1

t1ttt03|/b1|1

t1ttt03|/c1|1

l1ttt03|/a1|3

l1ttt03|/b1|3

l1ttt03|/c1|3

1 answer:

Answer 0: (score: 0)

One way with awk:

awk '
# set the input and output field separator to "|"
BEGIN{FS=OFS="|"}

# Do this action when number of fields on a line is 3 for first file only. The
# action is to strip the number portion from first field and store it as a key
# along with the second field. The value of this should be field 3
NR==FNR&&NF==3{sub(/[0-9]+$/,"",$1);a[$1$2]=$3;next}

# For the second file if number of fields is 2, store the line in a variable
# called line. Validate if field 1 (without numbers) and 2 is present in
# our array. If so, print the line followed by "|" followed by value from array.
NF==2{line=$0;sub(/[0-9]+$/,"",$1);if($1$2 in a){print line OFS a[$1$2]};next}1
' file file
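A few things may matter on other data: the script strips all trailing digits rather than exactly two, it drops two-field rows that have no match, and it can print a row during the first pass if a three-field row appears before its two-field counterpart. The following is a hedged variant, not the answerer's code, that tries to address those points. The helper function `key`, the scratch directory, and the extra row `x9zzz01|/q1` are assumptions added for illustration.

```shell
# Sample data in a scratch directory (hypothetical location).
tmpdir=$(mktemp -d)
printf '%s\n' \
    't1ttt03|/a1|1' \
    't1ttt01|/a1' \
    'x9zzz01|/q1' > "$tmpdir/file"   # 3-field row first, plus a row with no match

result=$(awk '
BEGIN { FS = OFS = "|" }
# Key: field 1 without its last two digits, a separator, then field 2.
function key(f1, f2) { sub(/[0-9][0-9]$/, "", f1); return f1 SUBSEP f2 }
# Pass 1: only collect third fields, never print.
NR == FNR { if (NF == 3) a[key($1, $2)] = $3; next }
# Pass 2: complete 2-field rows when a match exists, then print every row.
NF == 2 && (key($1, $2) in a) { $0 = $0 OFS a[key($1, $2)] }
{ print }
' "$tmpdir/file" "$tmpdir/file")
printf '%s\n' "$result"
rm -r "$tmpdir"
```

Using SUBSEP between the two key parts avoids accidental collisions that plain concatenation such as $1$2 could produce, and rows without a match pass through unchanged instead of disappearing.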

Test:

$ cat file
t1ttt01|/a1
t1ttt01|/b1
t1ttt01|/c1
t1ttt03|/a1|1
t1ttt03|/b1|1
t1ttt03|/c1|1
l1ttt03|/a1|3
l1ttt03|/b1|3
l1ttt03|/c1|3
$ awk 'BEGIN{FS=OFS="|"}NR==FNR&&NF==3{sub(/[0-9]+$/,"",$1);a[$1$2]=$3;next}NF==2{line=$0;sub(/[0-9]+$/,"",$1);if($1$2 in a){print line OFS a[$1$2]};next}1' file file
t1ttt01|/a1|1
t1ttt01|/b1|1
t1ttt01|/c1|1
t1ttt03|/a1|1
t1ttt03|/b1|1
t1ttt03|/c1|1
l1ttt03|/a1|3
l1ttt03|/b1|3
l1ttt03|/c1|3
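The command above prints to standard output, while the question asks for the existing file to be updated. Redirecting straight back to the input file would truncate it before awk reads it, so a common workaround is to write to a temporary file and then move it over the original. The sketch below reuses the one-liner from the test; the variables `f` and `tmp` and the two-row sample are assumptions for illustration.

```shell
f=$(mktemp)    # stand-in for the real flat file
printf '%s\n' 't1ttt01|/a1' 't1ttt03|/a1|1' > "$f"

# Write the result to a temporary file, and replace the original only if awk succeeded.
tmp=$(mktemp) &&
awk 'BEGIN{FS=OFS="|"}NR==FNR&&NF==3{sub(/[0-9]+$/,"",$1);a[$1$2]=$3;next}NF==2{line=$0;sub(/[0-9]+$/,"",$1);if($1$2 in a){print line OFS a[$1$2]};next}1' "$f" "$f" > "$tmp" &&
mv "$tmp" "$f"
cat "$f"
```

Note that mv replaces the file with the temporary one, so the original ownership and permissions may not carry over; if that matters, copying the contents back with cat "$tmp" > "$f" and then removing "$tmp" is one alternative.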