I have a huge file with the following structure:
>ABC_123|XX|YY|ID
CNHGYDGHA
>BBC_153|XX|YY|ID
ACGFDRER
I need to split this file based on the first value on each header line:
File1: ABC_123 -> should contain
>ABC_123|XX|YY|ID
CNHGYDGHA
File2: BBC_153 -> should contain
>BBC_153|XX|YY|ID
ACGFDRER
Answer 0 (score: 0)
This generates two files from your input, ABC_123 and BBC_153:
awk -F'|' 'NF > 1 {        # more than one field, i.e. the line contains |
    close(out)             # close the previous file (does nothing if none is open)
    out = $1               # use the first field as the file name
    sub(/^>/, "", out)     # strip the leading > from the name
}
{ print >> out }' file     # write to the file, opening it in append mode if needed
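If you are sure that each file name will only be opened once, you can use > instead of >>. Otherwise keep >>, because > truncates the file each time it is reopened after close().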
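To illustrate the difference between the two redirections, here is a minimal sketch. It assumes a hypothetical input, sample.fa, in which the ID ABC_123 appears in two separate records; the scratch directory and file name are illustrative and not part of the question.

```shell
# Work in a scratch directory so nothing existing is overwritten
# (demo_dir and sample.fa are illustrative names, not from the question).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical input: ABC_123 appears twice, separated by another record.
printf '%s\n' '>ABC_123|XX|YY|ID' 'CNHGYDGHA' \
              '>BBC_153|XX|YY|ID' 'ACGFDRER' \
              '>ABC_123|XX|YY|ID' 'TTTT' > sample.fa

# Append mode: reopening ABC_123 after close() keeps the earlier record.
awk -F'|' 'NF > 1 { close(out); out = $1; sub(/^>/, "", out) }
           { print >> out }' sample.fa
append_lines=$(awk 'END { print NR }' ABC_123)

rm -f ABC_123 BBC_153

# Truncate mode: reopening ABC_123 after close() discards the earlier record.
awk -F'|' 'NF > 1 { close(out); out = $1; sub(/^>/, "", out) }
           { print > out }' sample.fa
truncate_lines=$(awk 'END { print NR }' ABC_123)

echo "append: $append_lines lines, truncate: $truncate_lines lines"
# prints: append: 4 lines, truncate: 2 lines
```

Note that append mode also adds to files left over from an earlier run, so it is worth starting in an empty output directory.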
Answer 1 (score: 0)
An awk approach:
awk -F'|' '/^>.+\|/{ fn = substr($1, 2) }{ print > fn }' file
Viewing the 2 sample files that were created:
$ head [AB]BC_*
==> ABC_123 <==
>ABC_123|XX|YY|ID
CNHGYDGHA
==> BBC_153 <==
>BBC_153|XX|YY|ID
ACGFDRER
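For a huge input, inspecting files with head does not scale, so a quick sanity check may help. The following is a rough sketch only: it runs the one-liner inside a separate directory and compares counts. The names input.fa and out/ are assumptions made for illustration.

```shell
# Scratch directory; input.fa and out/ are illustrative names.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' '>ABC_123|XX|YY|ID' 'CNHGYDGHA' \
              '>BBC_153|XX|YY|ID' 'ACGFDRER' > input.fa

# Run the split inside out/ so the directory holds only the generated files.
mkdir out
(cd out && awk -F'|' '/^>.+\|/{ fn = substr($1, 2) }{ print > fn }' ../input.fa)

# Distinct IDs in the input should equal the number of files created.
id_count=$(grep '^>' input.fa | cut -d'|' -f1 | sort -u | awk 'END { print NR }')
file_count=$(ls out | awk 'END { print NR }')

# Every input line should end up in exactly one output file.
input_lines=$(awk 'END { print NR }' input.fa)
output_lines=$(cat out/* | awk 'END { print NR }')

echo "ids=$id_count files=$file_count in=$input_lines out=$output_lines"
# prints: ids=2 files=2 in=4 out=4
```

If the two pairs of numbers disagree, some records were lost or written to an unexpected file name.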