I have a file test.txt. I am searching it for several patterns and printing the matches for each pattern separately, one command at a time:
awk 'substr($1,5,15) ~ /ccc/ { print $0 }' test.txt >test1.txt
awk 'substr($1,5,15) ~ /abb/ { print $0 }' test.txt >test2.txt
awk 'substr($1,5,15) ~ /abc/ { print $0 }' test.txt >test3.txt
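Now, can I do this in a single run? That is, after
awk 'substr($1,5,15) ~ /ccc/ { print $0 }' test.txt
can I run, only on the lines that did not match the pattern above,
awk 'substr($1,5,15) ~ /abb/ { print $0 }'
and then, likewise only on the lines that are still unmatched,
awk 'substr($1,5,15) ~ /abc/ { print $0 }'
Input file test.txt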
NNNNNabcabAAAAATCTAATCTGCCAGTT
NNNNNabcccTTTTTCTAGTCACGATAGCC
NNNNNaaabbCTAGTTTGTGTAGTAATTTT
NNNNNaaaabTTTTTTTTTTTTTTTTTTTT
NNNNNabbbbTTTTTTCACTACTGGGTTTC
NNNNNabcaaTTTTTTTTAATGGGTCTCAA
NNNNNabaccTTTTTTTTTCGGGAGGCGGG
NNNNNccaaaTTTTTTTTTTTTTATTTGAG
NNNNNabcccTTTTTTTTTACACACAATTC
NNNNNabcccTAAGACTGGCCCACAGCTGA
NNNNNabcaaTAGAGACGGGGTTTCACCAT
NNNNNabcaaTTTTTGTCGAAGATCTCACC
NNNNNabcabTTGGTAAACAGGCGGGTGTA
NNNNNabcccTACTTTTTTTAGTGATACAC
NNNNNaaabbTTTTTGCAAAAAGTAATTTG
NNNNNabcabTTTTTTTTTCTTTCTGCCTG
NNNNNabcaaTTTTGAGACAGAATCTTGCT
NNNNNaaabbTTTTTTTTTTTTTACTAGTG
NNNNNabcccTAGACAGGGAATACTTTATT
NNNNNabcabGACAGGGAATACTTATATTC
awk 'substr($1,5,15) ~ /ccc/ { print $0 }' test.txt >test1.txt
test1.txt
NNNNNabcccTTTTTCTAGTCACGATAGCC
NNNNNabcccTTTTTTTTTACACACAATTC
NNNNNabcccTAAGACTGGCCCACAGCTGA
NNNNNabcccTACTTTTTTTAGTGATACAC
NNNNNabcccTAGACAGGGAATACTTTATT
awk 'substr($1,5,15) ~ /abb/ { print $0 }' test.txt >test2.txt
test2.txt
NNNNNaaabbCTAGTTTGTGTAGTAATTTT
NNNNNabbbbTTTTTTCACTACTGGGTTTC
NNNNNaaabbTTTTTGCAAAAAGTAATTTG
NNNNNaaabbTTTTTTTTTTTTTACTAGTG
awk 'substr($1,5,15) ~ /abc/ { print $0 }' test.txt >test3.txt
test3.txt
NNNNNabcabAAAAATCTAATCTGCCAGTT
NNNNNabcccTTTTTCTAGTCACGATAGCC
NNNNNabcaaTTTTTTTTAATGGGTCTCAA
NNNNNabcccTTTTTTTTTACACACAATTC
NNNNNabcccTAAGACTGGCCCACAGCTGA
NNNNNabcaaTAGAGACGGGGTTTCACCAT
NNNNNabcaaTTTTTGTCGAAGATCTCACC
NNNNNabcabTTGGTAAACAGGCGGGTGTA
NNNNNabcccTACTTTTTTTAGTGATACAC
NNNNNabcabTTTTTTTTTCTTTCTGCCTG
NNNNNabcaaTTTTGAGACAGAATCTTGCT
NNNNNabcccTAGACAGGGAATACTTTATT
NNNNNabcabGACAGGGAATACTTATATTC
When I do this, the following lines end up in two of the output files (test1.txt and test3.txt):
NNNNNabcccTAAGACTGGCCCACAGCTGA
NNNNNabcccTACTTTTTTTAGTGATACAC
NNNNNabcccTAGACAGGGAATACTTTATT
NNNNNabcccTTTTTCTAGTCACGATAGCC
NNNNNabcccTTTTTTTTTACACACAATTC
What I want is that once a line has been printed, it is no longer tested against the remaining patterns. My expected output:
test1.txt
NNNNNabcccTTTTTCTAGTCACGATAGCC
NNNNNabcccTTTTTTTTTACACACAATTC
NNNNNabcccTAAGACTGGCCCACAGCTGA
NNNNNabcccTACTTTTTTTAGTGATACAC
NNNNNabcccTAGACAGGGAATACTTTATT
test2.txt
NNNNNaaabbCTAGTTTGTGTAGTAATTTT
NNNNNabbbbTTTTTTCACTACTGGGTTTC
NNNNNaaabbTTTTTGCAAAAAGTAATTTG
NNNNNaaabbTTTTTTTTTTTTTACTAGTG
test3.txt
NNNNNabcabAAAAATCTAATCTGCCAGTT
NNNNNabcaaTTTTTTTTAATGGGTCTCAA
NNNNNabcaaTAGAGACGGGGTTTCACCAT
NNNNNabcaaTTTTTGTCGAAGATCTCACC
NNNNNabcabTTGGTAAACAGGCGGGTGTA
NNNNNabcabTTTTTTTTTCTTTCTGCCTG
NNNNNabcaaTTTTGAGACAGAATCTTGCT
NNNNNabcabGACAGGGAATACTTATATTC
Answer 0 (score: 3)
To do all three operations in a single awk process, try:
awk 'substr($1,5,15) ~ /ccc/ { print>"test1.txt"}
substr($1,5,15) ~ /abb/ { print>"test2.txt"}
substr($1,5,15) ~ /abc/ { print>"test3.txt"}' test.txt
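Here, print>"test1.txt" prints to the file test1.txt.
Note that > does not mean quite the same thing in awk as it does in the shell. In awk, as in the shell, the first print to a file overwrites the file's previous contents. Unlike the shell, however, subsequent print > statements in the same awk run append to the file.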
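The redirection behaviour described above can be checked with a small experiment. This is only a sketch: demo_out.txt is a made-up file name, and the scratch directory is there so that no real files are touched.

```shell
# Work in a scratch directory (demo_out.txt is a hypothetical name)
cd "$(mktemp -d)" || exit 1
printf 'old contents\n' > demo_out.txt

# Three input lines, so print > "demo_out.txt" runs three times in one awk process
printf 'a\nb\nc\n' | awk '{ print > "demo_out.txt" }'

# The old contents are gone, but all three new lines are present
cat demo_out.txt
```

If > behaved as it does in the shell on every print, only the last line would survive.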
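The command above still writes a line to every file whose pattern it matches. To write each line only to the file for the first pattern it matches, add next:
awk 'substr($1,5,15) ~ /ccc/ { print>"test1.txt"; next}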
substr($1,5,15) ~ /abb/ { print>"test2.txt"; next}
substr($1,5,15) ~ /abc/ { print>"test3.txt"}' test.txt
Here, when a match is found, next tells awk to skip the remaining tests and start over with the next input line.
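As a quick sanity check of the next version, the sketch below runs it on six lines copied from the question's test.txt and counts how often each line appears across the three output files. The scratch directory and the shortened input are assumptions made for the demonstration only.

```shell
cd "$(mktemp -d)" || exit 1

# Six lines taken from the question's test.txt
cat > test.txt <<'EOF'
NNNNNabcabAAAAATCTAATCTGCCAGTT
NNNNNabcccTTTTTCTAGTCACGATAGCC
NNNNNaaabbCTAGTTTGTGTAGTAATTTT
NNNNNaaaabTTTTTTTTTTTTTTTTTTTT
NNNNNabbbbTTTTTTCACTACTGGGTTTC
NNNNNabcaaTTTTTTTTAATGGGTCTCAA
EOF

awk 'substr($1,5,15) ~ /ccc/ { print > "test1.txt"; next }
     substr($1,5,15) ~ /abb/ { print > "test2.txt"; next }
     substr($1,5,15) ~ /abc/ { print > "test3.txt" }' test.txt

# Every matched line should be listed with a count of 1
cat test1.txt test2.txt test3.txt | sort | uniq -c
```

The abccc line matches both /ccc/ and /abc/, but because of next it is written to test1.txt only. The aaaab line matches none of the patterns and appears in no output file.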
Answer 1 (score: 2)
awk '
{
    str = substr($1,5,15)
    out = 0
    if (str ~ /ccc/) out=1
    else if (str ~ /abb/) out=2
    else if (str ~ /abc/) out=3
}
out { print > ("test" out ".txt") }
' test.txt
With GNU awk, you could use a switch statement instead of the nested ifs.
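A minimal sketch of what the switch variant might look like, assuming gawk 4.0 or later is installed as gawk. It relies on gawk accepting regexp constants as case values. The four-line input is a shortened copy of the question's data, used here for illustration only.

```shell
cd "$(mktemp -d)" || exit 1

cat > test.txt <<'EOF'
NNNNNabcccTTTTTCTAGTCACGATAGCC
NNNNNaaabbCTAGTTTGTGTAGTAATTTT
NNNNNabcaaTTTTTTTTAATGGGTCTCAA
NNNNNaaaabTTTTTTTTTTTTTTTTTTTT
EOF

gawk '
{
    # The first case whose regexp matches wins; break prevents fall-through
    switch (substr($1,5,15)) {
    case /ccc/: out = 1; break
    case /abb/: out = 2; break
    case /abc/: out = 3; break
    default:    out = 0      # no pattern matched, so nothing is printed
    }
}
out { print > ("test" out ".txt") }
' test.txt
```

The cases are tested in order, so the priority of ccc over abb over abc is the same as in the if/else chain.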
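Answer 2 (score: 0)
This golfed version assumes that no line matches more than one pattern. In the sample input the abccc lines match both ccc and abc; the alternation matches at the leftmost position, which is abc, so those lines are not classified as ccc.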
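gawk '{
    # the array argument to match() is a gawk extension; the substr() is probably unnecessary
    if (!match(substr($1,5,15), /(ccc)|(abb)|(abc)/, A)) next   # skip lines with no match
    for (n = 1; n <= 3; n++) if (n in A) break                   # number of the group that matched
    print > ("test" n ".txt")                                    # print to variable filename
}' test.txt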
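The caveat about simultaneous matches can be checked with plain POSIX awk, without the gawk array extension. The sketch below feeds in characters 5 to 19 of the first abccc line from the question and prints where the alternation matched. The variable name result is only for the demonstration.

```shell
# Characters 5-19 of NNNNNabcccTTTTTCTAGTCACGATAGCC
result=$(echo 'NabcccTTTTTCTAG' |
    awk '{ if (match($0, /ccc|abb|abc/)) print RSTART, substr($0, RSTART, RLENGTH) }')

echo "$result"   # prints: 2 abc
```

The match starts at position 2 with abc, even though ccc is listed first in the alternation, because the regex engine reports the leftmost match. With this data the single-regex approach therefore sends abccc lines to test3.txt, which differs from the expected output in the question.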