Conditional string replacement based on a matching column

Date: 2017-09-12 04:47:38

Tags: perl awk sed

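I want to match a string in columnX and, when it matches, replace columnY with a fixed string. For example, how can I match the strings in column3 of file1 against file2 in the example below, and replace column2 of file1 with the fixed string "AU" whenever a match is found? Lines of file1 with no match should be printed to the output unchanged. file1 and file2 each contain more than 100K such lines.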

file1:

0,DS,"C_3363/Y"
1,DS,"C_3363/Y"
0,UU,"C_3364/Y"
1,UU,"C_3364/Y"

file2:

0, "C_3364/Y"
1, "C_3364/Y"

Desired output:

0,DS,"C_3363/Y"
1,DS,"C_3363/Y"
0,AU,"C_3364/Y"
1,AU,"C_3364/Y"

1 answer:

Answer 0 (score: 0)

Another GNU awk solution, using a single FS:

$ cat tst.awk
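BEGIN { FS = "," }
# file2 pass: strip the space after the comma, then record column 2 as a key.
NR==FNR { sub(/ /, "", $2); a[$2]++; next }
# file1 pass: on a key match in column 3, print with column 2 set to AU.
($3 in a) { printf "%s,AU,%s\n", $1, $3; next }
1   # every other line is printed unchanged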

The same, as a command line:

awk -F, 'NR==FNR{sub(/[[:space:]]/,"",$2); a[$2]++; next} ($3 in a){ printf "%s,AU,%s\n", $1,$3; next}1' input2.txt input1.txt

With the input files:

$ cat input2.txt
0, "C_3364/Y"
1, "C_3364/Y"
1, "A_3364/Y"
1, "B_3364/Y"

$ cat input1.txt
0,DS,"C_3363/Y"
1,DS,"C_3363/Y"
0,UU,"C_3364/Y"
1,UU,"C_3364/Y"

you get:

$ awk -F, 'NR==FNR{sub(/[[:space:]]/,"",$2); a[$2]++; next} ($3 in a){ printf "%s,AU,%s\n", $1,$3; next}1' input2.txt input1.txt
0,DS,"C_3363/Y"
1,DS,"C_3363/Y"
0,AU,"C_3364/Y"
1,AU,"C_3364/Y"
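
If file1 might carry more than three columns, a variant that assigns to $2 instead of rebuilding the line with printf may be easier to maintain. The following is only a sketch under assumptions: the file names file1.csv, file2.csv and result.csv and the array name keys are made up for illustration, and the data is the sample from the question. It trims leading and trailing blanks from the key rather than removing a single space.

```shell
#!/bin/sh
# Recreate the sample inputs from the question (hypothetical file names).
cat > file1.csv <<'EOF'
0,DS,"C_3363/Y"
1,DS,"C_3363/Y"
0,UU,"C_3364/Y"
1,UU,"C_3364/Y"
EOF

cat > file2.csv <<'EOF'
0, "C_3364/Y"
1, "C_3364/Y"
EOF

# First file (file2.csv): trim blanks around column 2 and remember it as a key.
# Second file (file1.csv): if column 3 is a known key, overwrite column 2.
# Assigning to $2 rebuilds the record with OFS, so any further columns survive.
awk 'BEGIN { FS = OFS = "," }
NR == FNR { gsub(/^[ \t]+|[ \t]+$/, "", $2); keys[$2]; next }
$3 in keys { $2 = "AU" }
{ print }' file2.csv file1.csv > result.csv

cat result.csv
```

Both files are read once and only the keys from file2 are held in memory, so this should cope with inputs of 100K lines. One caveat: NR==FNR is also true for the second file when the first file is empty, so an empty file2 would need separate handling.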