I have two files: file1 and file2.
file1:
1,0,0
2,1,2
file2:
abc gdksjhkjhfkdjfljdkjldk jkm kl;lll (sds; dks; id:1;)
zxc erefdjhfkdjfljdkjldk erewr jkm kl;lll (sds; dks; id:2;)
Output:
#abc gdksjhkjhfkdjfljdkjldk jkm kl;lll (sds; dks; id:1;)
zxc erefdjhfkdjfljdkjldk erewr jkm kl;lll (sds; dks; id:2;)
If the number after id in file2 matches the first column of file1,
then: if the third column in file1 is 0, print $1 of file2 = abc, else $1 of file2 = zxc;
if the second column in file1 is 0, insert # at the beginning of the line.
Another case, file1:
1,0,0
3,1,2
file2:
abc gdksjhkjhfkdjfljdkjldk jkm kl;lll (sds; dks; id:1;)
zxc erefdjhfkdjfljdkjldk erewr jkm kl;lll (ders; dks; id:2;)
sdsd sdsdsdsddddsdjldk vbvewqr dsm wwl;awww (cvv; fgs; id:3;)
Sometimes the files will contain a different number of lines.
In that case, if column one in file1 does not match the id in file2, it has to continue checking with the next line in file2.
How can I do this matching and modification with a shell script, without merging the two files?
Answer 0 (score: 2)
GNU awk 4
Use this awk script:
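# First file (file1): store the three comma-separated columns per line number.
FNR==NR{
    arr[FNR][1] = $1
    arr[FNR][2] = $2
    arr[FNR][3] = $3
}
# Second file (file2): compare the id with the same-numbered line of file1.
FNR!=NR{
    # extract the digits that follow "id:"
    val = gensub(/.*id:([0-9]+)[^0-9]*.*/, "\\1", "g", $0)
    if (arr[FNR][1] == val) {
        # second column is 0: prefix the line with "#"
        if (arr[FNR][2] == 0)
            printf "#"
        # third column is 0: the first word becomes "a", otherwise "b"
        if (arr[FNR][3] == 0)
            $1 = "a"
        else
            $1 = "b"
    }
    print $0
}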
Usage: awk -F '[, ]' -f script.awk file1 file2
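The two blocks of the script above are selected by comparing FNR with NR. As a small illustration (the file names a.txt and b.txt and their contents are made up for this sketch, and fnr_demo is just a hypothetical helper name), printing both counters shows that they agree only while the first file is being read:

```shell
# Hypothetical sample files, created in a temporary directory.
dir=$(mktemp -d)
printf 'x\ny\n' > "$dir/a.txt"
printf 'p\nq\nr\n' > "$dir/b.txt"

# Print NR (lines read so far across all files), FNR (line number within
# the current file) and which branch the FNR==NR test would select.
fnr_demo() {
    awk '{ print NR, FNR, (FNR == NR ? "first file" : "later file") }' \
        "$dir/a.txt" "$dir/b.txt"
}
fnr_demo
# 1 1 first file
# 2 2 first file
# 3 1 later file
# 4 2 later file
# 5 3 later file
```

One caveat with this idiom: if the first file is empty, FNR==NR is also true while the second file is read.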
GNU awk 3
An attempt to make the script work for earlier versions of awk:
# This awk script will perform these checks for EVERY single line:
# when FNR == NR we are in the first file
# FNR is the line number of the current file
# NR is the total number of lines read so far
FNR==NR{
    # save the line of file1 to an array, indexed by its line number
    arr[FNR] = $0
}
# we are now in file 2, because FNR could be 1 but NR is now 1 + lines
# in file 1
FNR!=NR{
    # create an array by splitting the corresponding line of file 1
    # we split using a comma: 0,1,2 => [0, 1, 2]
    split(arr[FNR], vals, ",")
    # use regex to extract the id number, we drop everything from the
    # line besides the number after "id:"
    val = gensub(/.*id:([0-9]+)[^0-9]*.*/, "\\1", "g", $0)
    # if first value of line in file1 is same as ID
    if (vals[1] == val) {
        # if second value of line in file1 is 0
        if (vals[2] == 0)
            # print # at beginning of line without adding a newline
            printf "#"
        # if third value of line in file1 is 0
        if (vals[3] == 0)
            # save "a" to var, else
            var = "a"
        else
            # save "b" to var
            var = "b"
        # now sub the first word of the line [^ \t]* by var
        # and keep everything that follows (...) = \\1
        # the current line is $0
        # and print this modified line (now it's printed with a newline)
        print gensub(/^[^ \t]*([ \t].*)/, var "\\1", "g", $0)
    } else {
        # the id does not match: print the line unchanged
        print $0
    }
}
Simply run:
awk -f script.awk file1 file2
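Both scripts above pair line N of file1 with line N of file2, so a rule is applied only when the matching id sits on the same line number in both files. For the case where the files have different numbers of lines, a sketch that instead indexes file1 by its first column might look like the following. It assumes every file2 line carries at most one id:<number> token, uses only POSIX awk features (match, substr, sub) rather than gensub, and keeps the "a"/"b" replacement of the first word used above. The helper name match_by_id is hypothetical, and the sample data is the second case from the question.

```shell
# Sample data: the second case from the question, in a temporary directory.
dir=$(mktemp -d)
printf '1,0,0\n3,1,2\n' > "$dir/file1"
cat > "$dir/file2" <<'EOF'
abc gdksjhkjhfkdjfljdkjldk jkm kl;lll (sds; dks; id:1;)
zxc erefdjhfkdjfljdkjldk erewr jkm kl;lll (ders; dks; id:2;)
sdsd sdsdsdsddddsdjldk vbvewqr dsm wwl;awww (cvv; fgs; id:3;)
EOF

# match_by_id FILE1 FILE2: look up each id of FILE2 in FILE1 by value,
# not by line number.
match_by_id() {
    awk '
    # First file: remember columns 2 and 3 under the id found in column 1.
    FNR == NR {
        split($0, vals, ",")
        second[vals[1]] = vals[2]
        third[vals[1]] = vals[3]
        next
    }
    # Second file: pull out the digits that follow "id:".
    {
        id = ""
        if (match($0, /id:[0-9]+/))
            id = substr($0, RSTART + 3, RLENGTH - 3)
        if (id != "" && (id in second)) {
            # Replace the first word, then prefix "#" when column 2 is 0.
            sub(/^[^ \t]*/, (third[id] + 0 == 0 ? "a" : "b"))
            if (second[id] + 0 == 0)
                $0 = "#" $0
        }
        print
    }' "$1" "$2"
}
match_by_id "$dir/file1" "$dir/file2"
```

With this data the first line comes out as "#a gdksjhkjhfkdjfljdkjldk ...", the id:2 line is printed unchanged because file1 has no row for it, and the id:3 line has its first word replaced by "b" even though its row is on line 2 of file1.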