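I'm trying to reformat a file so that each output line shows part of the second column of a header line (the chromosome, start, and end from the `range=` field), followed by the sequence from the line below it. I've tried awk, sed, and grep, but I only get part of the output I want. The input looks like this: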
>hg19_ct_UserTrack_3545_(null) range=chr1:20802866-20802871 5'pad=0 3'pad=0 strand=+ repeatMasking=none
GATAAG
>hg19_ct_UserTrack_3545_(null) range=chr1:23866529-23866534 5'pad=0 3'pad=0 strand=+ repeatMasking=none
TTATCT
>hg19_ct_UserTrack_3545_(null) range=chr1:24345525-24345530 5'pad=0 3'pad=0 strand=+ repeatMasking=none
GATAAG
to
chr1 20802866 20802871 GATAAG
chr1 23866529 23866534 TTATCT
chr1 24345525 24345530 GATAAG
Answer 0 (score: 1)
$ sed 'N; s/.*range=\([[:alnum:]]*\):\([[:digit:]]*\)-\([[:digit:]]*\).*\n\([[:alpha:]]*\)/\1 \2 \3 \4/' test.fa
chr1 20802866 20802871 GATAAG
chr1 23866529 23866534 TTATCT
chr1 24345525 24345530 GATAAG
Answer 1 (score: 1)
Another solution:
awk -F "[=: -]" '{getline a; print $3,$4,$5,a}' file
Answer 2 (score: 1)
awk -F'[=: -]' '/^>/{s=$3" "$4" "$5; next} {print s,$0}' file
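For anyone who wants to see why the field numbers work, here is a minimal annotated sketch of the same awk approach. It assumes the input has exactly the layout shown in the question. The temporary file and the `sample` and `result` variable names are made up for illustration only.

```shell
# Write two records from the question to a temporary file (hypothetical sample).
sample=$(mktemp)
cat > "$sample" <<'EOF'
>hg19_ct_UserTrack_3545_(null) range=chr1:20802866-20802871 5'pad=0 3'pad=0 strand=+ repeatMasking=none
GATAAG
>hg19_ct_UserTrack_3545_(null) range=chr1:23866529-23866534 5'pad=0 3'pad=0 strand=+ repeatMasking=none
TTATCT
EOF

# The separator class [=: -] splits a header line into:
#   $1 = track name, $2 = "range", $3 = chromosome, $4 = start, $5 = end
result=$(awk -F'[=: -]' '
  /^>/ { s = $3 " " $4 " " $5; next }   # header line: remember chr, start, end
       { print s, $0 }                  # sequence line: print coordinates, then sequence
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Because the coordinates are stored in `s` rather than read with `getline`, a sequence that wraps over several lines would produce one output line per sequence line, each repeating the same coordinates.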