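Writing a for loop for a simple awk command (linux)

Time: 2015-02-14 09:43:30

Tags: linux loops awk

Question: I am trying to find multiple specific lines (species) in a file and then print only the 5th line after each species name to a new file. I can do this for each species individually, but I cannot work out how to loop through every species in the document. For example: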

awk 'c&&!--c;/species_1$/{c=5}' results.out > speciesnames

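How can I turn this command into a loop that does the following (iterating over every species in the file):

species 1, print the 5th line to the file titled speciesnames

species 2, print the 5th line to the file titled speciesnames

species n, print the 5th line to the file titled speciesnames

Any help would be much appreciated. I have very little experience with loops. Thanks

Example of the data structure in results.out: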

Query= species_1

length=341
Score
bits
Line 5, relevant info
description
description
description
description
description
description
description
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
nucleotides
data
data
data
data
data
data

Query= species_2

length=341

.......

Desired output in the file speciesnames:

Line 5, relevant info for species 1
Line 5, relevant info for species 2
Line 5, relevant info for species n

3 Answers:

Answer 0: (score: 1)

Maybe something like this:

awk 'c&&!--c;/species_[0-9]+$/{c=5}' file

awk '/species_[0-9]+/{a[NR+5]} {b[NR]=$0} END {for (i in a) print b[i]}' file

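Both commands print the 5th line after every species hit. Note that the second one prints in no guaranteed order, because for (i in a) walks the array in an unspecified order.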
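If the ordering of the array version is a concern, one possible variation is to print each wanted line as soon as it is read instead of collecting everything for END. The sketch below is only an illustration under assumptions: the sample text mimics the layout in the question, and the names sample, want and ordered are made up here.

```shell
# Build a small sample in the layout shown in the question (hypothetical data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Query= species_1

length=341
Score
bits
Line 5, relevant info for species 1
description

Query= species_2

length=341
Score
bits
Line 5, relevant info for species 2
description
EOF

# On each species line, remember the line number 5 lines ahead;
# print a line as soon as its number is one that was remembered.
ordered=$(awk '/species_[0-9]+$/ { want[NR + 5] } NR in want' "$sample")
printf '%s\n' "$ordered"
rm -f "$sample"
```

Because lines are printed while the file is read, the output follows input order and the whole file never has to be held in memory.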

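Code adjusted after the new input:

awk 'c&&!--c;/species [0-9]+$/{c=4}' file

In this data there is no _ between species and the number, only a space. The line you want ("Line 5, relevent info") is the 4th line after the species hit, not the 5th, so c=4.

Example data: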

cat file
Query= species 1
length=341
Score
bits
Line 5, relevent info
description
description
description
description
description
description
Query= species 5
length=341
Score
bits
Line 5, relevent info need this
description
description
description
description
description
Query= species 8
length=341
Score
bits
Line 5, relevent info more data
description
description
description
description
description
Query= species 6423
length=341
Score
bits
Line 5, relevent infom, yes here it is
description
description
description
description
description

Final solution:

awk 'c&&!--c {print i " --> " $0} /species [0-9]+$/{c=4;i=$2 FS $3}' file
species 1 --> Line 5, relevent info
species 5 --> Line 5, relevent info need this
species 8 --> Line 5, relevent info more data
species 6423 --> Line 5, relevent infom, yes here it is
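For readers new to the c&&!--c idiom, the final one-liner can be spelled out with longer names. This is a sketch of the same countdown idea, not a replacement for the answer; the variable names countdown, name and offset, and the shortened sample data, are assumptions made for illustration.

```shell
# Shortened copy of the sample data above (hypothetical temp file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Query= species 1
length=341
Score
bits
Line 5, relevent info
description
Query= species 5
length=341
Score
bits
Line 5, relevent info need this
description
EOF

labelled=$(awk -v offset=4 '
    # Countdown running: decrement, and print when it reaches zero.
    countdown > 0 && --countdown == 0 { print name " --> " $0 }
    # Species line: remember its name and (re)start the countdown.
    /species [0-9]+$/ { countdown = offset; name = $2 FS $3 }
' "$sample")
printf '%s\n' "$labelled"
rm -f "$sample"
```

Passing the offset with -v makes it easy to switch between 4 and 5 depending on whether the data has blank lines after the Query line.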

Answer 1: (score: 0)

An approach using the getline function

 awk '/^Query *= *species_[0-9]/{print $0":";for(i=1;i<=5;++i){if(getline>0 &&i==5){print}}}' file

Start a loop on every line matching /^Query *= *species_[0-9]/ and read the next 5 lines:

for(i=1;i<=5;++i)

Print once the 5th line is reached:

{if(getline>0 &&i==5){print}}

Example file:
Query= species_1

length=341
Score
bits
Line 5, relevant info
description
description
data
data
data
data
data
data

Query= species_2

length=341
Score
bits
Line 5, relevant info
description
description
data
data
data
data
data
data

Result

Query= species_1:
Line 5, relevant info
Query= species_2:
Line 5, relevant info
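One thing the getline loop above does not handle is a record that is cut off before its 5th line: the header is printed anyway. A possible variation, sketched here under assumptions (the truncated sample and the names header, line and fifth are made up), reads into a variable and stops when the input runs out.

```shell
# Sample whose second record ends early (hypothetical data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Query= species_1

length=341
Score
bits
Line 5, relevant info
description

Query= species_2

length=341
EOF

fifth=$(awk '
    /^Query *= *species_[0-9]/ {
        header = $0
        # Pull the next 5 lines; give up if the file ends first.
        for (i = 1; i <= 5; ++i)
            if ((getline line) <= 0) exit
        print header ":"
        print line
    }
' "$sample")
printf '%s\n' "$fifth"
rm -f "$sample"
```

As with the original, lines consumed by getline are never tested against the Query pattern, so this only suits data where records are at least 5 lines apart.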

Answer 2: (score: 0)

You could do something like

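linenr=0
species=unknown
# Read results.out line by line; IFS= and -r keep each line intact
while IFS= read -r line; do
   if [[ "${line}" = Query* ]]; then
      # New record: reset the counter and take the name after "="
      linenr=0
      species=${line#*=}
      species=${species# }     # drop the single leading space
   else
      (( linenr = linenr + 1 ))
      if [ "${linenr}" -eq 5 ]; then
         # Write the 5th line after the Query line to <species>.out
         echo "${line}" > "${species}.out"
      fi
   fi
done < results.out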
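The same one-file-per-species result can probably be had from awk alone, reusing the countdown from the question. The following is a rough sketch, assuming the blank-line layout of results.out shown in the question; the temp directory and the names workdir, name and out are illustrative only.

```shell
# Work in a scratch directory so the generated .out files are easy to remove.
workdir=$(mktemp -d)
cat > "$workdir/results.out" <<'EOF'
Query= species_1

length=341
Score
bits
Line 5, relevant info for species 1
description

Query= species_2

length=341
Score
bits
Line 5, relevant info for species 2
description
EOF

(
    cd "$workdir" || exit 1
    awk '
        # Countdown hit zero: write this line to <species>.out and close it.
        c && !--c { out = name ".out"; print > out; close(out) }
        # Query line: take the text after "= " as the species name.
        /^Query= / { name = $0; sub(/^Query= */, "", name); c = 5 }
    ' results.out
)
first=$(cat "$workdir/species_1.out")
second=$(cat "$workdir/species_2.out")
printf '%s\n%s\n' "$first" "$second"
rm -rf "$workdir"
```

Closing each file right after writing keeps the number of open files small when there are many species.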