Extracting rows from a multi-column file

Time: 2016-07-20 08:36:32

Tags: linux awk

I have a dataset in the following format:

Identified_____ID#2357_____ReadSequence:1238  
Unknown_____0_____ReadSequence:0979  
Unknown_____0_____ReadSequence:5476  
Identified_____ID#567899_____ReadSequence:4376  

Using awk, how can I extract the ReadSequence entries, but only from the rows marked Identified (based on the entry in the first column)?

3 answers:

Answer 0 (score: 2)

$ awk -F"_____" '$1=="Identified" {print $3}' test.in 
ReadSequence:1238
ReadSequence:4376

If you only want the ReadSequence ID, gsub is your friend:

$ awk -F"_____" '$1=="Identified" {gsub(/^.*:/,"",$3); print $3}' test.in 
1238
4376 
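
To try this without an existing test.in, here is a minimal self-contained sketch. The temporary file and the variable names `input` and `ids` are assumptions made for illustration, and the sample rows are copied from the question without trailing whitespace. It uses sub() instead of gsub(), since the anchored pattern can match at most once per field, so the result should be the same.

```shell
# Recreate the sample data from the question in a temporary file
# (the temporary file is an assumption made for this sketch).
input=$(mktemp)
cat > "$input" <<'EOF'
Identified_____ID#2357_____ReadSequence:1238
Unknown_____0_____ReadSequence:0979
Unknown_____0_____ReadSequence:5476
Identified_____ID#567899_____ReadSequence:4376
EOF

# Keep rows whose first field is exactly "Identified", strip everything
# up to and including the colon from field 3, then print what is left.
ids=$(awk -F'_____' '$1 == "Identified" { sub(/^.*:/, "", $3); print $3 }' "$input")
printf '%s\n' "$ids"

# Clean up the temporary file.
rm -f "$input"
```

Note that sub() and gsub() modify $3 in place, which is fine here because the rest of the record is not printed afterwards.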

Answer 1 (score: 1)

awk -F'_____' '/^Identified/ {print $NF}' file
ReadSequence:1238
ReadSequence:4376

Or, using split() instead of -F:

awk '/^Identified/ {split($0,a,"_____");print a[3]}' info
ReadSequence:1238
ReadSequence:4376

If you only want the value of ReadSequence, then:

awk -F'_____' '/^Identified/ {split($NF,a,":"); print a[2]}' file
1238
4376
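
One difference between the answers is worth noting: `$1=="Identified"` compares the whole first field, while `/^Identified/` matches any line that merely starts with that text. On the data in the question both give the same result. The sketch below shows where they could diverge; the `IdentifiedLater` row is a hypothetical addition that does not appear in the question, and the variable names are assumptions.

```shell
# Sample rows; the third line is a hypothetical addition (not in the question)
# whose first column merely starts with "Identified".
data='Identified_____ID#2357_____ReadSequence:1238
Unknown_____0_____ReadSequence:0979
IdentifiedLater_____ID#1_____ReadSequence:0001'

# Exact comparison on the first field: only the first row matches.
exact=$(printf '%s\n' "$data" | awk -F'_____' '$1 == "Identified" { print $NF }')

# Prefix regex on the whole line: the hypothetical row matches as well.
prefix=$(printf '%s\n' "$data" | awk -F'_____' '/^Identified/ { print $NF }')

printf 'exact:\n%s\nprefix:\n%s\n' "$exact" "$prefix"
```

If the first column can only ever be Identified or Unknown, either form should work; the exact comparison is the stricter choice when other values might appear.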

Answer 2 (score: 0)

php