How to mark lines in a target file when they match

Time: 2014-07-29 17:15:25

Tags: linux bash perl awk sed

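My bash script reads /tmp/file.CSV line by line until EOF

and checks whether each line matches a line in another file, /tmp/target.CSV (on an exact match, the script needs to add a "+" at the beginning of the matching line)

For example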

  line="/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11" ( from /tmp/file.CSV )

We can see that $line exactly matches this line:

    1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK ( from /tmp/target.CSV )

Then we need to add "+" at the start of that line in /tmp/target.CSV, giving

   +1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK

Please suggest how to do this in my bash script with sed, awk, or a perl one-liner

 more /tmp/target.CSV


 1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK
 2,Ama,LINUX,"/VPNfig/EME/EM8/Franlecom Eana SA/Amen",comrse,temporal,OK
 3,ArnTel,LINUX,"/VPConfig/EME/EM3/ArmenTem Armenia)/ArmenTe",Coers,FAIL
 4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL
 142,ucell,LINUX,/VPNAAonfig/EMEA/EM3/Ucell/ede3fc34,Glo,G/rvrev443,OK

 more file.CSV

 /VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11
 /VPNfig/EME/EM8/Franlecom Eana SA/Amen
 /VPConfig/EME/EM3/ArmenTem Armenia)/ArmenTe
 /VPConfig/EME/EM0/TTR/Ar
 /VPNAAonfig/EMEA/EM3/Ucell/ede3fc34

My bash code

 while read -r line
 do
     grep -iq "$line" /tmp/target.CSV

     if [[ $? -ne 0 ]]
     then
         echo "$line" NOT MATCH target.CSV
     else
         sed .................
     fi
 done < /tmp/file.CSV

Example of the expected result (for the files /tmp/target.CSV and file.CSV)

     more /tmp/target.CSV


    +1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK
    +2,Ama,LINUX,"/VPNfig/EME/EM8/Franlecom Eana SA/Amen",comrse,temporal,OK
    +3,ArnTel,LINUX,"/VPConfig/EME/EM3/ArmenTem Armenia)/ArmenTe",Coers,FAIL
     4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL



more file.CSV

+/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11
+/VPNfig/EME/EM8/Franlecom Eana SA/Amen
+/VPConfig/EME/EM3/ArmenTem Armenia)/ArmenTe
 /VPConfig/EME/EM0/TTR/Ar
+/VPNAAonfig/EMEA/EM3/Ucell/ede3fc34

2 answers:

Answer 0: (score: 2)

awk -F\" -v OFS=\" 'FNR==NR{ a[$0]++; next} $2 in a { $0 = "+" $0 } 1' file.csv target.csv 

Output:

+1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK
+2,Ama,LINUX,"/VPNfig/EME/EM8/Franlecom Eana SA/Amen",comrse,temporal,OK
+3,ArnTel,LINUX,"/VPConfig/EME/EM3/ArmenTem Armenia)/ArmenTe",Coers,FAIL
4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL
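
For readers new to the two-file idiom, here is a minimal sketch of the same logic spelled out with comments. It runs against throwaway copies of two sample rows; the scratch directory and the names `workdir`, `seen` and `marked` are only illustrative and not part of the original command.

```shell
# Build small sample files in a scratch directory (illustrative data only).
workdir=$(mktemp -d)
cat > "$workdir/file.csv" <<'EOF'
/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11
/VPConfig/EME/EM0/TTR/Ar
EOF
cat > "$workdir/target.csv" <<'EOF'
1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK
4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL
EOF

marked=$(awk -F'"' '
    # FNR equals NR only while the first file (file.csv) is being read:
    # remember every path and skip to the next line.
    FNR == NR { seen[$0] = 1; next }
    # Second file: with " as the separator, $2 is the quoted path.
    $2 in seen { $0 = "+" $0 }
    # Print every line of target.csv, changed or not.
    { print }
' "$workdir/file.csv" "$workdir/target.csv")

printf '%s\n' "$marked"
rm -rf "$workdir"
```

One caveat with `FNR == NR`: if file.csv is empty, the condition also holds for target.csv, so nothing would be printed.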

Or, to prefix unmatched lines with a space instead (the two commands are equivalent):

awk -F\" -v OFS=\" 'FNR==NR{ a[$0]++; next} { print ($2 in a ? "+" : " ") $0 }' file.csv target.csv 
awk -F\" -v OFS=\" 'FNR==NR{ a[$0]++; next} { $0 = ($2 in a ? "+" : " ") $0 } 1' file.csv target.csv

Output:

+1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK
+2,Ama,LINUX,"/VPNfig/EME/EM8/Franlecom Eana SA/Amen",comrse,temporal,OK
+3,ArnTel,LINUX,"/VPConfig/EME/EM3/ArmenTem Armenia)/ArmenTe",Coers,FAIL
 4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL

This one works whether or not each line begins with a single space:

awk -F\" -v OFS=\" 'FNR==NR{ a[$0]++; next} { sub(/^ ?/, $2 in a ? "+" : " ") } 1' file.csv target.csv
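
To see why `sub(/^ ?/, ...)` covers both cases, here is a small sketch that applies only that substitution to a line with and without a leading space. The helper name `mark_line` and the shortened sample lines are made up for this illustration.

```shell
# mark_line MARK LINE: replace an optional single leading space with MARK.
# (Hypothetical helper, used here only to isolate the sub() call.)
mark_line() {
    printf '%s\n' "$2" | awk -v m="$1" '{ sub(/^ ?/, m) } 1'
}

with_space=$(mark_line "+" " 1,ull,LINUX")     # input starts with a space
without_space=$(mark_line "+" "1,ull,LINUX")   # input has no leading space
unmatched=$(mark_line " " "4,Ahh,LINUX")       # unmatched lines get a space

printf '[%s]\n[%s]\n[%s]\n' "$with_space" "$without_space" "$unmatched"
```

The pattern `^ ?` matches either one space or the empty string at the start of the line, so the marker replaces the space when there is one and is simply prepended when there is not.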

Update (1)

awk -F, -v OFS=, 'FNR==NR{ sub(/[ \t\r]*$/, ""); a[$0]++; next} { t = $4; gsub(/(^"|"$)/, "", t); sub(/^[ \t]*/, t in a ? "+" : " "); } 1' file.csv target.csv 

Output:

+1,ull,LINUX,"/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11",fnt,rfdr,OK
+2,Ama,LINUX,"/VPNfig/EME/EM8/Franlecom Eana SA/Amen",comrse,temporal,OK
+3,ArnTel,LINUX,"/VPConfig/EME/EM3/ArmenTem Armenia)/ArmenTe",Coers,FAIL
 4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL
+142,ucell,LINUX,/VPNAAonfig/EMEA/EM3/Ucell/ede3fc34,Glo,G/rvrev443,OK
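
The change of separator from `"` to `,` is what lets row 142 match, since its path is not quoted. Here is a rough sketch comparing the two ways of extracting the key from that row and from a quoted row; the variable names are illustrative only.

```shell
unquoted_row='142,ucell,LINUX,/VPNAAonfig/EMEA/EM3/Ucell/ede3fc34,Glo,G/rvrev443,OK'
quoted_row='4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL'

# Splitting on " : an unquoted row has no second field at all.
key_by_quote=$(printf '%s\n' "$unquoted_row" | awk -F'"' '{ print $2 }')

# Splitting on , : take field 4 and strip surrounding quotes if present.
key_by_comma=$(printf '%s\n' "$unquoted_row" |
    awk -F, '{ t = $4; gsub(/(^"|"$)/, "", t); print t }')
quoted_key_by_comma=$(printf '%s\n' "$quoted_row" |
    awk -F, '{ t = $4; gsub(/(^"|"$)/, "", t); print t }')

printf 'quote split: [%s]\n' "$key_by_quote"
printf 'comma split: [%s]\n' "$key_by_comma"
printf 'comma split, quoted row: [%s]\n' "$quoted_key_by_comma"
```

Note that the comma-based version assumes the path is always the fourth field and never contains a comma itself.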

Update (2)

awk -F, -v OFS=, 'FNR==NR{ sub(/[ \t\r]$/, ""); a[$0]++; b[FNR]=$0; next} { t = $4; gsub(/(^"|"$)/, "", t); r = " "; if (t in a) { c[t]++; r = "+" }; sub(/^[ \t]*/, r); } 1; END { for (i = 1; i in b; ++i) { t = b[i]; sub(/^[ \t]*/, t in c ? "+" : " ", t); print t > "/dev/stderr" } }' file.csv target.csv > new_target.csv 2> new_file.cs
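
Since that one-liner is dense, here is a hedged multi-line sketch of the same idea: mark the rows of target.csv, and also write a marked copy of file.csv showing which paths were found. It is not the exact command above. It writes the second output to a file named by an awk variable instead of going through stderr, it also strips leading whitespace from the keys, and the names `marked_keys`, `wanted`, `hit` and `order` as well as the tiny sample data are assumptions made for the illustration.

```shell
workdir=$(mktemp -d)
printf '%s\n' ' /a/b' ' /c/d' > "$workdir/file.csv"
printf '%s\n' ' 1,x,LINUX,"/a/b",OK' ' 2,y,LINUX,/e/f,OK' > "$workdir/target.csv"

awk -F, -v marked_keys="$workdir/new_file.csv" '
    FNR == NR {
        # Key file: trim surrounding whitespace (and a trailing CR).
        sub(/^[ \t]+/, ""); sub(/[ \t\r]+$/, "")
        wanted[$0] = 1      # set of paths to look for
        order[FNR] = $0     # remember input order for the END block
        n = FNR
        next
    }
    {
        # Target file: field 4 is the path, possibly wrapped in quotes.
        key = $4; gsub(/(^"|"$)/, "", key)
        mark = " "
        if (key in wanted) { hit[key] = 1; mark = "+" }
        sub(/^[ \t]*/, mark)
        print
    }
    END {
        # Emit the key file again, marking the paths that were found.
        for (i = 1; i <= n; i++) {
            line = ((order[i] in hit) ? "+" : " ") order[i]
            print line > marked_keys
        }
    }
' "$workdir/file.csv" "$workdir/target.csv" > "$workdir/new_target.csv"

cat "$workdir/new_target.csv" "$workdir/new_file.csv"
```

The scratch directory is left in place so the two result files can be inspected; remove it with `rm -rf "$workdir"` when done.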

Answer 1: (score: -1)

Try this Perl one-liner:

perl -pi -e '$_="+".$_ if($_=~m{/VPNfig/EME/EM3/Ucll/ucelobeconn/6EKoHH11}is);' /tmp/target.CSV
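
The Perl command edits the file in place through `-i`, but only for one hard-coded path. POSIX awk has no in-place option, so to get the same in-place effect for every path in file.CSV, one approach is to write to a temporary file and move it over the original. A minimal sketch on scratch copies follows; the variable names `keys`, `target` and `tmp` and the sample rows are only for illustration.

```shell
workdir=$(mktemp -d)
keys="$workdir/file.csv"
target="$workdir/target.csv"
printf '%s\n' '/VPNfig/EME/EM8/Franlecom Eana SA/Amen' > "$keys"
cat > "$target" <<'EOF'
2,Ama,LINUX,"/VPNfig/EME/EM8/Franlecom Eana SA/Amen",comrse,temporal,OK
4,Ahh,LINUX,"/VPConfig/EMA/EM/llk/AAe",Coers,FAIL
EOF

# Write the marked version next to the target, then replace the target
# only if awk succeeded, so a failure never truncates the original.
tmp=$(mktemp "$workdir/target.XXXXXX")
if awk -F'"' 'FNR == NR { seen[$0]; next } $2 in seen { $0 = "+" $0 } 1' \
        "$keys" "$target" > "$tmp"; then
    mv -- "$tmp" "$target"
else
    rm -f -- "$tmp"
fi

cat "$target"
```

Creating the temporary file in the same directory as the target keeps the final `mv` a rename on the same filesystem.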