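I am trying to use grep to extract a specific read from a fastq file by searching for its sequence ID. This is what the file looks like: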
less all_barcode03.fastq.gz
@3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t runid=7204dc15205b93bfd6430ca0f3a0218f11ce0787 read=10 ch=120 start_time=2019-04-12T13:55:25Z
TCGGTAGCCACTTCGTTCAGTCAATTTGGGTTGTTTAACCGAGTCTTGTGTGTCCCAGTTACCAGGGTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTCGTGCGCCGCTTCAGTGATCAGTGAAGATGGGTTTGTGGTGGAATACTCTTGCTGCTCATGGCAAACTTTATGTTGGTTTTCTCATGCATTTGTTTCTCGTAATCCCATACGTCATCCAAAGTCATCTGAAAAAGAGGGAAGGGGTGGATTGTGGGTGAAATGTTGTGTACTCCTCTATAATGGGGCTCAGTTGACAAACAGGTGGAGGAGAGGATCATTTGCTTAAAGGGGTGAGTGAAGCGGAGTTTAAGGATAATTCAAGCTTTTAAAAGTGGCTTTAGAGGTAAAGGGTTAGCTCCCATGACCCACAGGATTTATAGGAGATGGCTCTGAACAAACCAGAGCCACACACACA
+
-%&&($$%%#%,-*),-5(&,$$%$%+).'-(+-4-(')%%$*+-,3...14,7/))/03.06-./-3:8.0(*,/7+*,966006.,(*(,-(&(*,./+--902/./),,,0,-/./,4(+0/,0).0-7048,(+*',*/.)*#(((.0--10764+('(%.3/+$&%&'./4'0.;:6.895778+0/*(28/),(+-/404/*'(),.16517&83+*/0/0.--033**$&'*,''*/,,,/..0.*0*0$##*((($/6&('-,.230/01/2+4,,::8719(*.4.'.26/0(*))0*+,(*+-,-+-.4765-$%&.'%.*/')(&''#-()*21,-.;+3).*,,'557686+(-7;-2:8))(&%%'*)**%&&).6&,*(.-'$'(*2+*0587:0+*+)/*/63--/*('#&)-68664&%534)/13.))'14*+**%%$$#
@69e7e435-a78c-4ec8-94cd-b0c1f3c40c11_t runid=7204dc15205b93bfd6430ca0f3a0218f11ce0787 read=15 ch=465 start_time=2019-04-12T13:55:25Z
TCGGTACTTCGTTCGGTTGGAGAAGGTGGTGTTGCCGAGTCTTGTGTCCCAGTTACCAGGGTTTTCGCATTTATCGTGGCTTGCTGCGTTTTCGTGCGCCACCGCTTCATGTGTGTGTGTGTGTCTGGTGTTATTACTCACTTGGCAAGCGTGTCTGGACAGCAGCTGTTTGAGTGTTGAGAGCGCTTCTTCTCCAGGAGAAGCGGTTGAGCCTAAGCTGAATCCCCGTCCGTCTTTATCTTCGGACATGCTCTGGATATGCCTGAGGAGGACAATGGAGGAACAGAACAGATGGATGAAGAGCTCATAAAACTGGCACACATGCATCAAAGCCCACCTTCGTCACTCTGATGACCAGTGACTGCCGTTTATTACTGCGATTTACCATGAAGTTATCTGCTTTTTGGGTCAGTTAGTGTGTGTGTGTGTGTGTGTGTGTGTGCCTTTTCTGTCCTCCAGATACTCAGTACTACAGAGGAGCTATTAATACTTACTACATCGATATGTTATGTAATATCATTCTAGCCTGCTACTCCTGTCTTCTGTATACAACTGTCGTCTGTCCCGAATAGCTCCTGGGTGCCCTCTCCTCCATAGTAGCCACAGTTACAGGAATATTACTCTTTATCATAGAAGCGGTATCTAGTAGAACAGTCCTTAGTTAAAATAATAACGGGGTGTGGGCATGTACAGCCTCTGGTATTCCGTTGCTCAGCAGAGCCTCATAACTCTCCTAGTGGCTCAGGAAGGCTGAAACAGGCTGTGTGCACCCAGCCAGCTGGAACTGTGTTTGAGTGCCATCTTGGAATACTGTTTATAAGCGCTCTTAAGTTATATGTGAGGATGGTGGTATTAGATATGGAAGTGTGTAGGAGGAGAAAGAGGAAATAGTGTCATGTTGATATGAACAGTTTGGTCAGTAAAATGAGGGCAGTAAAAAAGTGTTTTAAGCGTTTTGTCGGTCGACAATATGATAATAAAATGCATTTGGTTCACGATAACAAGAAAACAGAAAAGACCAGCAATGAATATTTAGCATTTTTTGTTTGAAAGATGAAACAAATAATTGAAATAGCTGCCAAATATTTGTGAAATGTACTAAATGGTCAGAGTGAAGATGCAGCTTTGAAAAGAAGATTCGGA
+
+&$%)'./-0,*1(&&&%#%&$(&)'%&&%$"#$&+,'*-*1+++5-73+)*/+,32/46552:/-+2025/+-057,$#$$&)/01,)433/2732'&$#&$"$'$((+*+),+,,,*+,,-11)*'&((*"0#"&*((,*.--.&.+-*,)-17861+&%'%)),73:60-/-32:++(('.')+56894,4+)./'%')%$&-,('%#41.'$%&')$0))/2.*04632,20)(+'&&,+7.97825-++**166678950-))%*+,-26-.6,*/(4.$+'+-5/0/.-02/-+)'%+73//245+(&(%%'))(&$#&&(7.:2-0;7014354398')-83/00/04:*330))&#)))-5/(-*++5#./+50-(,0765/1,,8//05/0.:0/%#$&)--+4+)+5575312+1&-')).'+&*%)(,,,((%++/,.2486112'&#$&##$%'(*+,1/)/+...+-.1312/1+**-(-.8---,*+,-.5,1,(+%..1,)--.8;441019.1780000313658;99621-,,.++)#,-.011537%#&-2,',-,86)(.''%(.2+/24,.23/./+*$)4--.0.340/+())0..62019-7:+).2(/*%),&--30/32*)&)%)$%')+2;829%*)'4:;401/,-71%.,'(*+)2837653/0-&/63861'(*-6*()5:.3--'%')',)2977&(%(%'+-/**-0727112246..*1,-..3&/.4535-3+3.00,7*%'1+12311321.35567:93&)*))'-/,2-7-.6/,..-4;6/3/&(&%**03745+-.-.::95544467..--))'*)#('*+,..(%)&'(%%&-+'++)*/1&&'$%&+*&())$()(,%+'$&'&($&'2.44:0..++#%).78*(((/1'($$&-:;98.(*00;;2-''),053.//3+&))+14-8&**,..01.2:;743425:7(,*.((+*,,-+'&*'+057,*(.53-(+3703/210.06256;.+,01.5<<5,06;:+.7)')3,$(+'4;.,*'*'*-4--)+-*)+&--,*$(+&(-$*,''/2778:;9/.857+%%'()*((*11-,)+-5-+,31/#&%$%5)-#%#
I then tried to display one of the reads by searching for its sequence ID:
grep '@3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t' all_barcode03.fastq.gz
grep '*@3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t*' all_barcode03.fastq.gz
grep @3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t all_barcode03.fastq.gz
grep *@3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t* all_barcode03.fastq.gz
None of the grep commands above returns anything, even though the file contains a line that starts with @3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t.
Answer 0 (score: 1)
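Use zgrep instead of grep on .gz files. Plain grep searches the compressed bytes, which do not contain the ID as readable text.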
From the man page: zgrep - search possibly compressed files for a regular expression
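If you want the whole four-line record rather than just the header line, something like the following minimal sketch may help. It decompresses to a pipe and greps the text, which is roughly what zgrep does for you. The function name fastq_record_by_id is made up for illustration, and the sketch assumes that every record is exactly four lines, that the ID contains no regex metacharacters, and that your grep supports -A (GNU and BSD grep do).

```shell
# Hypothetical helper (not from the original post): print the four-line
# FASTQ record whose header starts with the given read ID.
#   $1 = read ID without the leading '@'
#   $2 = gzip-compressed FASTQ file
fastq_record_by_id() {
    # gzip -dc writes the decompressed text to stdout and leaves the file untouched.
    # '^@' anchors the match to the start of a header line.
    # -A 3 also prints the three lines after the match (sequence, '+', quality).
    gzip -dc -- "$2" | grep -A 3 -- "^@$1"
}
```

With the file from the question this would be called as fastq_record_by_id 3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t all_barcode03.fastq.gz. Where zgrep passes context options through to grep (GNU zgrep does), zgrep -A 3 '^@3cb04ae7-2c7b-4da8-8d09-59edb5b8f45c_t' all_barcode03.fastq.gz should give the same result.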