AWK: printing a combination of string + bash variable + string

Time: 2018-10-17 03:39:02

Tags: awk fasta

I am trying to use awk to rename the contigs in a FASTA file, using an isolate ID plus a contig number that runs from 1 to n.

FASTA file:

  >NODE_1_length_172477_cov_46.1343
  GCAGGGCGCAGTTTTTGGAGGCTTGGCAAACCCGTGAGGGAAATTTGGCAGGCAAAATTT
  TGGCGGTCGTGCCGAAAAAAGCGGAGGCGATTTCAAATAAATTGTTTTTCACACATCATC
  CCAAGCGGCAGACGGAGTTTGCAGTCGGACAAATCAGGCAAGGGCGCGCAGAGTAAGTCA

The isolate ID is a variable because I want to do this for multiple files. I can print isolateIDnumber, but I need >isolateID_number.

    for file in /dir/*.fasta
    do
        name=$(basename "$file" .fasta)
        awk '/^>/{print "'"$name"'" ++i; next}{print}' $file > rename.fasta
    done;

This gives me:

 15AR07771
 GCAGGGCGCAGTTTTTGGAGGCTTGGCAAACCCGTGAGGGAAATTTGGCAGGCAAAATTT
 TGGCGGTCGTGCCGAAAAAAGCGGAGGCGATTTCAAATAAATTGTTTTTCACACATCATC
 CCAAGCGGCAGACGGAGTTTGCAGTCGGACAAATCAGGCAAGGGCGCGCAGAGTAAGTCA

Desired output:

 >15AR0777_1
 GCAGGGCGCAGTTTTTGGAGGCTTGGCAAACCCGTGAGGGAAATTTGGCAGGCAAAATTT
 TGGCGGTCGTGCCGAAAAAAGCGGAGGCGATTTCAAATAAATTGTTTTTCACACATCATC
 CCAAGCGGCAGACGGAGTTTGCAGTCGGACAAATCAGGCAAGGGCGCGCAGAGTAAGTCA

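The question is: where should I put the strings so that it prints >15AR0777_1 instead of 15AR07771?

I tried the following variants, among others, but they did not work: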

  awk '/^>/{print ">'"$name"'" "_" ++i; next}{print}' $file > rename.fasta
  awk '/^>/{print ">'"$name"'" _++i; next}{print}' $file > rename.fasta

Thanks!

1 answer:

Answer 0: (score: 4)

Use -v to pass the shell variable into the awk script, i.e.: awk -v awk_var="$bash_var"

From man awk:

-v var=val
--assign var=val
       Assign the value val to the variable var, before execution of the program begins.  Such variable values are available to the
       BEGIN rule of an AWK program.
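
As a minimal sketch of what the man page excerpt describes, the snippet below passes a shell variable to awk with -v and reads it inside a BEGIN rule. The value 15AR0777 is borrowed from the question's sample output, and the awk variable name id is only an illustrative choice.

```shell
# Hypothetical isolate ID, borrowed from the question's sample output
name="15AR0777"

# -v assigns the shell value to the awk variable "id" before the program
# starts, so it is already usable inside the BEGIN rule
header=$(awk -v id="$name" 'BEGIN { print ">" id "_" 1 }')

echo "$header"   # prints >15AR0777_1
```

Because the value travels as data rather than being spliced into the program text, no nested quoting is needed. One caveat: awk interprets backslash escape sequences in a -v value, so names containing backslashes would need different handling.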

Applied to the loop from the question:

    for file in dir/*.fasta
    do
        name=$(basename "$file" .fasta)
        awk -v name="$name" '/^>/{print ">" name "_" ++i; next}{print}' "$file" > rename.fasta
    done

Here is an all-awk version:

    awk '
    FNR==1 {                          # new file: close old output and build the new name
        close(f)                      # close the old output file
        n=FILENAME                    # get the filename of the new file
        gsub(/^.*\/|\.fasta$/,"",n)   # remove path and .fasta
        f="rename_" n ".fasta"        # new output file
        i=0                           # restart contig numbering for each file
    }
    /^>/ {
        $0=">" n "_" ++i              # >name_number
    }
    {
        print > f                     # print to output file
    }' dir/*.fasta                    # process .fasta files in dir

If there is a file dir/15AR07771.fasta, the script will produce the file ./rename_15AR07771.fasta. (Your version writes all output to rename.fasta, overwriting it on every iteration rather than appending; you might want to fix that.)
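
One possible way to address the overwriting issue while keeping the shell loop is to derive the output name from the input name. The sketch below is self-contained: it builds two tiny FASTA files in a scratch directory so it can be run as-is. The directory layout, the file names sampleA and sampleB, and the rename_ prefix are assumptions made for illustration only.

```shell
# Hypothetical scratch directory with two tiny FASTA files, for demonstration only
workdir=$(mktemp -d)
mkdir "$workdir/in" "$workdir/out"
printf '>NODE_1_length_5\nACGTA\n>NODE_2_length_4\nGGCC\n' > "$workdir/in/sampleA.fasta"
printf '>NODE_1_length_3\nTTT\n' > "$workdir/in/sampleB.fasta"

for file in "$workdir"/in/*.fasta
do
    name=$(basename "$file" .fasta)
    # One output file per input, so nothing is overwritten between iterations.
    # Each awk run starts with i unset, so numbering restarts at 1 per file.
    awk -v name="$name" '/^>/ { print ">" name "_" ++i; next } { print }' \
        "$file" > "$workdir/out/rename_$name.fasta"
done

cat "$workdir/out/rename_sampleA.fasta"
```

With real data, the loop would iterate over dir/*.fasta instead, and the output directory could be any location other than the input directory, so that renamed files are not picked up by the glob on a later run.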