Using awk in a bash script and parsing with variables

Time: 2011-05-31 06:31:05

Tags: bash variables awk

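I am trying to run queries against a large file, using awk inside a bash script. The script reads parameters line by line from a parameter file, stores them in variables, and passes them to awk. The result of each query needs to be stored in a separate file whose name is given in the parameter file: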

#!/bin/bash

while IFS=\t read chr start end name
do 

echo $chr $start $end $name

awk -v "chr=$chr" -v "start=$start" -v "end=$end" '$1==chr && $3>start && $3<end && $11<5E-2 {print $0}' bigfile.out > ${name}.out

done < parameterfile

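Unfortunately, the awk command produces no output. Any idea what might be wrong? (Judging by the echo command, the bash variables are assigned correctly.)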

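3 Answers:

Answer 0 (score: 1)

IMHO bash does not understand "\t" in IFS as a tab: an unquoted \t is just the letter t, so read splits on "t" rather than on tabs. Try this: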

while IFS=$(echo -e "\t") read chr start end name
do
        echo =$chr=$start=$end=$name=
done <<EOF
11      1       10      aaa bbb
12      3       30      ccc bbb
EOF

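This one splits tab-separated text. Your variant assigns the whole line to $chr; the unquoted echo then re-splits it on whitespace, which is why its output looked right. Always print variable assignments with a visible delimiter, '=' for example. :)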
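To make the difference visible, here is a minimal sketch that reads one made-up tab-separated line with both IFS settings. The sample values are hypothetical, and `$'\t'` is bash's ANSI-C quoting for a tab, used here as an alternative to `$(echo -e "\t")`.

```shell
#!/bin/bash
# Hypothetical sample line: four tab-separated fields.
line=$'chr1\t100\t200\tregionA'

# Unquoted \t is just the letter "t" to bash, so read splits on "t".
# This line contains no "t", so the whole line lands in the first variable.
IFS=\t read -r chr start end name <<< "$line"
broken="=$chr=$start=$end=$name="

# $'\t' is a real tab character, so the four fields are split as intended.
IFS=$'\t' read -r chr start end name <<< "$line"
fixed="=$chr=$start=$end=$name="

# Print both results with visible delimiters.
printf '%s\n' "$broken" "$fixed"
```

In the first result everything sits between the first pair of '=' signs and the remaining three variables are empty, which is what awk received in the original script.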

Answer 1 (score: 1)

The key is in IFS:

while IFS='   ' read chr start end name

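What sits between the single quotes is a literal tab character (it may show up as spaces on this page). In bash the same value can also be written as IFS=$'\t'.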
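Put together, the loop from the question might look like the sketch below once IFS is a real tab. The sample files, their contents, and the 11-column tab-separated layout of bigfile.out are assumptions made up for illustration; the question does not show the real data, and `-F'\t'` is only appropriate if the data file really is tab-separated.

```shell
#!/bin/bash
# Work in a scratch directory so the made-up sample files clobber nothing.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical parameter file: chromosome, start, end, output name (tab-separated).
printf 'chr1\t100\t200\tregionA\n' > parameterfile

# Hypothetical data file with 11 tab-separated columns; column 3 is the
# position and column 11 the p-value. Only the first row should match.
printf 'chr1\tx\t150\t.\t.\t.\t.\t.\t.\t.\t0.01\n'  > bigfile.out
printf 'chr1\tx\t150\t.\t.\t.\t.\t.\t.\t.\t0.20\n' >> bigfile.out
printf 'chr2\tx\t150\t.\t.\t.\t.\t.\t.\t.\t0.01\n' >> bigfile.out
printf 'chr1\tx\t250\t.\t.\t.\t.\t.\t.\t.\t0.01\n' >> bigfile.out

# IFS=$'\t' applies only to read; -r keeps backslashes literal.
while IFS=$'\t' read -r chr start end name
do
    # Quote the variables so values with spaces survive intact.
    awk -F'\t' -v chr="$chr" -v start="$start" -v end="$end" \
        '$1 == chr && $3 > start && $3 < end && $11 < 5E-2' \
        bigfile.out > "${name}.out"
done < parameterfile

# Count the rows that passed the filter.
matched=$(wc -l < regionA.out | tr -d ' ')
echo "regionA.out has $matched matching line(s)"
```

One caveat with a tab as the read delimiter: bash treats tab as IFS whitespace, so consecutive tabs collapse and empty fields in the parameter file would shift the later values.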

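Answer 2 (score: 0)

I do not know what exactly requires bash in the middle here, but if the requirement is just to read input from a file or from the user, then this should work: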

#!/bin/bash  
cat parameterfile |awk 'BEGIN{  
    FS="\t";  
}{  
 # If parameterfile has multiple lines and you want to allow comments in them, perhaps
 #  if($0~"^[ \t]*#")next;
 # will let lines starting with # (after any number of spaces or tabs) be recognized
 # as comments instead of parameters :-)
 #
 # Read the parameter file, whatever format it may be.
 # Here we assume parameterfile is tab-separated, so inside BEGIN{} we set FS to tab.
 # If it is a CSV, use A[0]=split($0,A,","); and then chr=A[1]; and so on.
 chr=$1;  
 start=$2;  
 end=$3;  
 name=$4;  
 # Let us start reading the data file. Its name could also come from the parameter file, or from -v var=arg on the awk command line.
 file_to_read_from="bigfile.out";  
 while((getline line_of_data < file_to_read_from)>0){  
    # Since I do not have psychic powers to guess the format of the input file, here are some examples.
    # If it is separated by one or more spaces
    # B[0]=split(line_of_data,B,"[ ]+");
    # If it is separated by tabs  
    B[0]=split(line_of_data,B,"\t");  

    # Check if the line matches our specified condition
    if( B[1]==chr && B[3]>start && B[3]<end && B[11]<5E-2 ){
      # Print the matching data line (a bare print would emit $0, the parameter line)
      print line_of_data > (name ".out");
    }
    }  

 }  
 # Done reading all lines from file_to_read_from.
 # Close the opened file so the next parameter line rereads it from the top,
 # and so that we can handle millions of files.
 close(file_to_read_from);
 # If parameterfile has multiple lines, then more is processed.
 # If you only want the first line of parameter file to be read, then
 # exit 0;
 # should get you out of here
}'
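The script above rereads the data file once per parameter line, which only works because of close(). A small sketch of that getline/close pattern follows; the file data.txt and the two piped-in records are made up for illustration.

```shell
#!/bin/bash
# Hypothetical two-line data file in a scratch directory.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/data.txt"

# Two input records drive two passes over data.txt.
# Without close(), the second pass would start at end-of-file and read nothing.
count=$(printf 'first\nsecond\n' | awk -v f="$workdir/data.txt" '
{
    while ((getline line < f) > 0) {
        n++          # only the variable line changes; $0 keeps the input record
    }
    close(f)         # the next getline on f reopens it from the top
}
END { print n }')

echo "lines read across both passes: $count"
```

Because `getline line < f` fills only `line`, anything that should be written out from inside the loop has to be passed to print explicitly.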