I have a file like this:
>TCONS_00000066 +1
PPAAARTDLSPPQHVLHVYKRYGPPRQRRRPCPQTWWWQLPHRAAATHPRGEGPRASNPTRQQHFILVYNFSSFLSSWLSLSLLSSPFCYLYICDCHGNTEDEGPLMY*LVSSSLGAFVCKDFHLIDLLDLLFWIEAGYLHAVLHTILQSGRSDR*SRPKYRLTELSVCISVRTSSVINSKC*HN
>TCONS_00000066 +2
RRLLRAPTCHHPSTSSTYTSATVHRGSVDVLVRKHGGGSFLIEQQQLILEGKGPELLILHGNNTLYLCIISLRF*VHGYLCLSYLLPFAISIFVIAMEIQKTRGR*CIDL*VLVWGLSFARIFI*LIFLICYFGSKLATFMPCCIPYFSLVGQTDDRDRSID*PNFRFVYL*GQVLSSIQNVNII
>TCONS_00000066 +3
AGCCAHRLVTTPARPPRIQALRSTAAASTSLSANMVVAASSSSSSNSSSRGRAQSF*SYTATTLYTCV*FLFVSEFMAIFVSLIFSLLLSLYL*LPWKYRRRGAADVLTCEF*FGGFRLQGFSFD*SS*FVILDRSWLPSCRVAYHTSVWSVRPMIETEVSINRTFGLYICEDKFCHQFKMLT*
>TCONS_00000066 -1
YYVNILN**QNLSSQIYKPKVRLIDTSVSIIGLTDQTEVWYATRHEGSQLRSKITNQEDQSNENPCKRKPPN*NSQVNTSAAPRLLYFHGNHKYRDSKREKIRETKIAMNSETKRNYTQV*SVVAV*D*KLWALPLEDELLLLDEEAATTMFADKDVDAAAVDRSACIRGGRAGVVTSRCAQQPA
>TCONS_00000066 -2
IMLTF*IDDRTCPHRYTNRKFG*SILRSRSSV*PTRLKYGMQHGMKVASFDPK*QIKKINQMKILANESPQTRTHKSIHQRPLVFCISMAITNIEIAKGRR*ERQR*P*TQKRREIIHKYKVLLPCRIRSSGPFPSRMSCCCSMRKLPPPCLRTRTSTLPRWTVALVYVEDVLGW*QVGARSSRR
>TCONS_00000066 -3
LC*HFELMTELVLTDIQTESSVNRYFGLDHRSDRPD*SMVCNTA*R*PASIQNNKSRRSIK*KSLQTKAPKLELTSQYISGPSSSVFPWQSQI*R*QKGEDKRDKDSHELRNEEKLYTSIKCCCRVGLEALGPSPRG*VAAAR*GSCHHHVCGQGRRRCRGGP*RLYTWRTCWGGDKSVRAAAG
>TCONS_00000130 +1
LPARPRLQGALQRHRGGKPINQSINQWW*LGQLKTKKERSN*SSC*IVKWYAGEGGDSGSGGGGRGDGGGDGEPARRHHARRRPPRQELPLQVDEPVRANEEGWVQGSWHQAARHGTGRFLQRRAHPNRDHQFARTTA*NPLPNVHPSAGRAMEKKIKGKEEKMKSPCITN*FVMMQAAVRVRSSLIGSIR*ICFTKGATDRLSWLAVWVHIHTTQTQILTI*PFAKNIFTNEQLPKLISNLTLLLNAKSCGAEFRHLSAK*YGAECTLAR*LSLPSAVARHSAPADVALRCLSSAPHDLALSKKVRFEISFGSGSFVKLVFTKG*IVKICATQTHSQEDMNIK*SREGHGFSPGFVPFGCTCTEMIYVVGLTDTKEHM***MIFVLLCQSFTLVFLTCFLSSTVVLRIQ*PQLMRLKWILAN*AYSLIFWLMVIL
>TCONS_00000130 +2
FQLALAFRELCNGIAEVNQSTNQSINGGSWVNSKQRKKEAINHLVEL*NGMQAKVEIVVREGEVGETVVATVNQLAATTLVVGLHDKSFLYRSTNPYERMRRVGCRVLGIRQHATARDGSFNAELTQIETINLHVPPPKIPFPMFTLPLGVLWRKRSKAKKRK*SHHASQINL**CRLQCELGAH*LDQSDEFVLPKEQLTD*AG*LSGYIYTRHKHKF*QFNPLQKIFLQMNSYQNLFQI*PFCLTPNRVALNLDTSAPNSMALNVRWHADLVSHPPWHGIQRQLTWR*GV*VPRHMI*R*AKRSDLK*VLAAVHL*N*FLQRVKLSKFVRHKHTHKKT*TSSEAGRGTVSHLDLCHLVVLVQR*SMLLD*QTPRNTCSSK*FLFYFVKVLHLYS*PVSCLAQ*C*EFSNLS**D*NGYWPIKLIASSFGLWLYL
>TCONS_00000130 +3
SSSPSPSGSSATASRR*TNQPINQSMVVVGSTQNKERKKQLIILLNCEMVCRRRWR*WFGRGRSGRRWWRR*TSSPPPRSSSASTTRASSTGRRTRTSE*GGLGAGFLASGSTPRHGTVPSTPSSPKSRPSICTYHRLKSPSQCSPFRWACYGEKDQRQRRENEVTMHHKLICDDAGCSAS*ELTDWINPMNLFYQRSN*QIELASCLGTYTHDTNTNFDNLTLCKKYFYK*TATKTYFKSDPFA*RQIVWR*I*TPQRQIVWR*MYVGTLT*SPIRRGTAFSAS*RGAEVSKFRAT*FSVEQKGQI*NKFWQRFICKISFYKGLNCQNLCDTNTLTRRHEHQVKQGGARFLTWICAIWLYLYRDDLCCWIDRHQGTHVVVNDFCFTLSKFYTCIPDLFLV*HSSVKNSVTSVDEIKMDIGQLSL*PHLLAYGYTY
>TCONS_00000130 -1
ISITISQKMRL*A*LANIHFNLIN*GY*ILNTTVLDKKQVRNTSVKL*QSKTKIIYYYMCSLVSVNPTT*IISVQVQPNGTNPGEKPCPSLLHLMFMSSCECVCVAQILTI*PFVKTNFTNEPLPKLISNLTFLLNAKSCGAELRHLSATSAGAECRATADGRLSQRANVHSAPYYLALRCLNSAPHDLALSKRVRFEISFGSCSFVKIFFAKG*IVKICVCVVCICTQTASQLNLSVAPLVKQIHRIDPISELLTRTAACIITN*FVMHGDFIFSSLPLIFFSIARPAEG*TLGRGF*AVVRAN*WSRFG*ARR*RNRPVPWRAA*CQEPCTQPSSFARTGSSTCRGSSCRGGRRRAWWRRAGSPSPPPSPRPPPPEPLSPPSPAYHFTIQQDD*LLLSFFVLS*PNYHH*LIDWLIGLPPRCRCRAP*RRGRAG
>TCONS_00000130 -2
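I want to get rid of the space between the two strings on each ID line, joining them with an underscore.
The new file should look like this: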
>TCONS_00000066_+1
PPAAARTDLSPPQHVLHVYKRYGPPRQRRRPCPQTWWWQLPHRAAATHPRGEGPRASNPTRQQHFILVYNFSSFLSSWLSLSLLSSPFCYLYICDCHGNTEDEGPLMY*LVSSSLGAFVCKDFHLIDLLDLLFWIEAGYLHAVLHTILQSGRSDR*SRPKYRLTELSVCISVRTSSVINSKC*HN
>TCONS_00000066_+2
RRLLRAPTCHHPSTSSTYTSATVHRGSVDVLVRKHGGGSFLIEQQQLILEGKGPELLILHGNNTLYLCIISLRF*VHGYLCLSYLLPFAISIFVIAMEIQKTRGR*CIDL*VLVWGLSFARIFI*LIFLICYFGSKLATFMPCCIPYFSLVGQTDDRDRSID*PNFRFVYL*GQVLSSIQNVNII
>TCONS_00000066_+3
AGCCAHRLVTTPARPPRIQALRSTAAASTSLSANMVVAASSSSSSNSSSRGRAQSF*SYTATTLYTCV*FLFVSEFMAIFVSLIFSLLLSLYL*LPWKYRRRGAADVLTCEF*FGGFRLQGFSFD*SS*FVILDRSWLPSCRVAYHTSVWSVRPMIETEVSINRTFGLYICEDKFCHQFKMLT*
>TCONS_00000066_-1
YYVNILN**QNLSSQIYKPKVRLIDTSVSIIGLTDQTEVWYATRHEGSQLRSKITNQEDQSNENPCKRKPPN*NSQVNTSAAPRLLYFHGNHKYRDSKREKIRETKIAMNSETKRNYTQV*SVVAV*D*KLWALPLEDELLLLDEEAATTMFADKDVDAAAVDRSACIRGGRAGVVTSRCAQQPA
>TCONS_00000066_-2
IMLTF*IDDRTCPHRYTNRKFG*SILRSRSSV*PTRLKYGMQHGMKVASFDPK*QIKKINQMKILANESPQTRTHKSIHQRPLVFCISMAITNIEIAKGRR*ERQR*P*TQKRREIIHKYKVLLPCRIRSSGPFPSRMSCCCSMRKLPPPCLRTRTSTLPRWTVALVYVEDVLGW*QVGARSSRR
>TCONS_00000066_-3
LC*HFELMTELVLTDIQTESSVNRYFGLDHRSDRPD*SMVCNTA*R*PASIQNNKSRRSIK*KSLQTKAPKLELTSQYISGPSSSVFPWQSQI*R*QKGEDKRDKDSHELRNEEKLYTSIKCCCRVGLEALGPSPRG*VAAAR*GSCHHHVCGQGRRRCRGGP*RLYTWRTCWGGDKSVRAAAG
>TCONS_00000130_+1
LPARPRLQGALQRHRGGKPINQSINQWW*LGQLKTKKERSN*SSC*IVKWYAGEGGDSGSGGGGRGDGGGDGEPARRHHARRRPPRQELPLQVDEPVRANEEGWVQGSWHQAARHGTGRFLQRRAHPNRDHQFARTTA*NPLPNVHPSAGRAMEKKIKGKEEKMKSPCITN*FVMMQAAVRVRSSLIGSIR*ICFTKGATDRLSWLAVWVHIHTTQTQILTI*PFAKNIFTNEQLPKLISNLTLLLNAKSCGAEFRHLSAK*YGAECTLAR*LSLPSAVARHSAPADVALRCLSSAPHDLALSKKVRFEISFGSGSFVKLVFTKG*IVKICATQTHSQEDMNIK*SREGHGFSPGFVPFGCTCTEMIYVVGLTDTKEHM***MIFVLLCQSFTLVFLTCFLSSTVVLRIQ*PQLMRLKWILAN*AYSLIFWLMVIL
>TCONS_00000130_+2
FQLALAFRELCNGIAEVNQSTNQSINGGSWVNSKQRKKEAINHLVEL*NGMQAKVEIVVREGEVGETVVATVNQLAATTLVVGLHDKSFLYRSTNPYERMRRVGCRVLGIRQHATARDGSFNAELTQIETINLHVPPPKIPFPMFTLPLGVLWRKRSKAKKRK*SHHASQINL**CRLQCELGAH*LDQSDEFVLPKEQLTD*AG*LSGYIYTRHKHKF*QFNPLQKIFLQMNSYQNLFQI*PFCLTPNRVALNLDTSAPNSMALNVRWHADLVSHPPWHGIQRQLTWR*GV*VPRHMI*R*AKRSDLK*VLAAVHL*N*FLQRVKLSKFVRHKHTHKKT*TSSEAGRGTVSHLDLCHLVVLVQR*SMLLD*QTPRNTCSSK*FLFYFVKVLHLYS*PVSCLAQ*C*EFSNLS**D*NGYWPIKLIASSFGLWLYL
>TCONS_00000130_+3
SSSPSPSGSSATASRR*TNQPINQSMVVVGSTQNKERKKQLIILLNCEMVCRRRWR*WFGRGRSGRRWWRR*TSSPPPRSSSASTTRASSTGRRTRTSE*GGLGAGFLASGSTPRHGTVPSTPSSPKSRPSICTYHRLKSPSQCSPFRWACYGEKDQRQRRENEVTMHHKLICDDAGCSAS*ELTDWINPMNLFYQRSN*QIELASCLGTYTHDTNTNFDNLTLCKKYFYK*TATKTYFKSDPFA*RQIVWR*I*TPQRQIVWR*MYVGTLT*SPIRRGTAFSAS*RGAEVSKFRAT*FSVEQKGQI*NKFWQRFICKISFYKGLNCQNLCDTNTLTRRHEHQVKQGGARFLTWICAIWLYLYRDDLCCWIDRHQGTHVVVNDFCFTLSKFYTCIPDLFLV*HSSVKNSVTSVDEIKMDIGQLSL*PHLLAYGYTY
>TCONS_00000130_-1
ISITISQKMRL*A*LANIHFNLIN*GY*ILNTTVLDKKQVRNTSVKL*QSKTKIIYYYMCSLVSVNPTT*IISVQVQPNGTNPGEKPCPSLLHLMFMSSCECVCVAQILTI*PFVKTNFTNEPLPKLISNLTFLLNAKSCGAELRHLSATSAGAECRATADGRLSQRANVHSAPYYLALRCLNSAPHDLALSKRVRFEISFGSCSFVKIFFAKG*IVKICVCVVCICTQTASQLNLSVAPLVKQIHRIDPISELLTRTAACIITN*FVMHGDFIFSSLPLIFFSIARPAEG*TLGRGF*AVVRAN*WSRFG*ARR*RNRPVPWRAA*CQEPCTQPSSFARTGSSTCRGSSCRGGRRRAWWRRAGSPSPPPSPRPPPPEPLSPPSPAYHFTIQQDD*LLLSFFVLS*PNYHH*LIDWLIGLPPRCRCRAP*RRGRAG
>TCONS_00000130_-2
I tried sed and tr, but did not get the desired output.
Answer 0: (score: 1)
Use tr, whose exact purpose is to replace one set of characters with another:
tr ' ' '_' < file
As a bonus, the -s option squeezes each run of consecutive matches into a single replacement character:
tr -s ' ' '_' < file
This has the following effect:
$ cat a
hello world this
is a sample file
$ tr -s ' ' '_' < a
hello_world_this
is_a_sample_file
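Of course, tr only writes to standard output. To save the change in the original file, redirect the output to a new file and then move that file over the original.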
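A minimal sketch of that save-back step, assuming a throwaway sample file created in a temporary directory (the path, the file name sample.fa, and the shortened sequences are hypothetical and only serve the demonstration):

```shell
# Build a small sample file in a scratch directory (hypothetical data).
workdir=$(mktemp -d)
file="$workdir/sample.fa"
printf '>TCONS_00000066 +1\nPPAAARTD\n>TCONS_00000066 +2\nRRLLRAPT\n' > "$file"

# Write the converted text to a temporary file first; replace the
# original only if tr exited successfully.
tr -s ' ' '_' < "$file" > "$file.tmp" && mv "$file.tmp" "$file"

# Show the rewritten file.
cat "$file"
```

Redirecting straight back into the same file (tr ... < file > file) would truncate it before tr reads anything, which is why the temporary file is needed.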
Answer 1: (score: 0)
It looks like you are trying to replace the space with an _ character. If so, you could consider this:
sed 's/[[:blank:]]\+/_/g' file
OR
sed 's/\(TCONS_[0-9]\{8\}\)[[:blank:]]\+/\1_/g' file
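In the second command you capture the characters you want to keep. Here those are TCONS_ followed by 8 digits, so the pattern that matches them goes inside a capture group \(...\), and \1 in the replacement puts them back. The [[:blank:]]\+ pattern then matches the one or more blank characters that follow. You must escape the + so that it repeats the previous token one or more times; unescaped, it would match a literal + character, because sed uses BRE (Basic Regular Expressions) by default.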
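As a hedged variation on the commands above: \+ in a basic regular expression is a GNU extension, so on other sed implementations the POSIX interval \{1,\} is the safer way to say "one or more". The sketch below also adds a /^>/ address so that only ID lines are edited. The sample file, its path, and the sequence line containing a space are hypothetical and exist only to show that non-ID lines are left alone.

```shell
# Build a small sample file in a scratch directory (hypothetical data).
demo_dir=$(mktemp -d)
demo_file="$demo_dir/sample.fa"
printf '>TCONS_00000130  -2\nLPARP RLQ\n>TCONS_00000130 +1\nFQLALAFR\n' > "$demo_file"

# /^>/        : apply the substitution only to lines starting with ">"
# \{1,\}      : POSIX BRE interval meaning "one or more" of the previous token
result=$(sed '/^>/ s/[[:blank:]]\{1,\}/_/g' "$demo_file")

printf '%s\n' "$result"
```

Because the substitution is limited to lines starting with >, a run of several blanks in an ID line collapses to one underscore while the second line keeps its space.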