I am trying to use awk to extract information from a file.
informationfile.txt looks like this:
>ENST00000342992.10 cdna:known chromosome:GRCh38:2:178525989:178807421:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000460472.6 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000589042.5 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000591111.5 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000425332.2 cdna:known chromosome:GRCh38:2:178663627:178667307:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000448510.2 cdna:known chromosome:GRCh38:2:178669625:178672418:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000360870.9 cdna:known chromosome:GRCh38:2:178744405:178807421:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000634225.1 cdna:known chromosome:GRCh38:2:178753361:178767825:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000436599.1 cdna:known chromosome:GRCh38:2:178786089:178794954:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
>ENST00000470257.1 cdna:known chromosome:GRCh38:2:178798495:178807408:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:retained_intron gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
>ENST00000412264.1 cdna:known chromosome:GRCh38:2:178802287:178830802:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
>ENST00000359218.9 cdna:known chromosome:GRCh38:2:178525989:178807423:-1 gene:ENSG00000155657.24 gene_biotype:protein_coding transcript_biotype:protein_coding gene_symbol:TTN description:titin [Source:HGNC Symbol;Acc:HGNC:12403]
GCAGTCGTGCATTCCCAGCCTCGCCTCGGGTGTAGGGATTGCATAGAAAAGCAAAACTAC
ACAGTCTTGACTGTGTAGTTTTGTTTTTAGGATTAGAGGCTCACCGATTCATGTCGGAGA
TGGTCAGAAAAACCAACTCTCCATAGGACGTCGTTTCAGAAGCAACCTTGGGCTTAGTCC
CACCCTTTTTAGGCACTCTTGAGAAATCAGAGTGCCTAGAAAGATGACAACTCAAGCACC
GACGTTTACGCAGCCGTTACAAAGCGTTGTGGTACTGGAGGGTAGTACCGCAACCTTTGA
GGCTCACATTAGTGGTTTTCCAGTTCCTGAGGTGAGCTGGTTTAGGGATGGCCAGGTGAT
TTCCACTTCCACTCTGCCCGGCGTGCAGATCTCCTTTAGCGATGGCCGCGCTAAACTGAC
GATCCCCGCCGTGACTAAAGCCAACAGTGGACGATATTCCCTGAAAGCCACCAATGGATC
TGGACAAGCGACTAGTACTGCTGAGCTTCTCGTGAAAGCTGAGACAGCACCACCCAACTT
CGTTCAACGACTGCAGAGCATGACCGTGAGACAAGGAAGCCAAGTGAGACTCCAAGTGAG
AGTGACTGGAATCCCTACACCTGTGGTGAAGTTCTACCGGGATGGAGCCGAAATCCAGAG
CTCCCTTGATTTCCAAATTTCACAAGAAGGCGACCTCTACAGCTTACTGATTGCAGAAGC
ATACCCTGAGGACTCAGGGACCTATTCAGTAAATGCCACCAATAGCGTTGGAAGAGCTAC
TTCGACTGCTGAATTACTGGTTCAAGGTGAAGAAGAAGTACCTGCTAAAAAGACAAAGAC
AATTGTTTCGACTGCTCAGATCTCAGAATCAAGACAAACCCGAATTGAAAAGAAGATTGA
AGCCCACTTTGATGCCAGATCAATTGCAACAGTTGAGATGGTCATAGATGGTGCCGCTGG
GCAACAGCTGCCACATAAAACACCTCCCAGGATTCCTCCGAAGCCAAAGTCAAGATCCCC
AACACCACCGTCTATTGCTGCCAAAGCACAGCTGGCTCGGCAGCAGTCCCCATCGCCCAT
AAGACACTCCCCTTCCCCGGTCAGACACGTGCGGGCACCGACCCCATCTCCGGTCAGGTC
CGTGTCTCCAGCAGCAAGAATCTCCACATCCCCCATCAGGTCTGTTAGGTCTCCATTGCT
CATGCGTAAGACTCAGGCATCCACCGTGGCCACAGGTCCTGAAGTGCCTCCCCCTTGGAA
GCAAGAGGGCTACGTGGCCTCCTCATCTGAGGCTGAGATGAGAGAGACAACGCTGACAAC
CTCTACTCAGATCAGGACAGAAGAGAGATGGGAAGGGAGATACGGTGTCCAGGAGCAAGT
GACCATCAGTGGTGCTGCGGGTGCTGCCGCCAGTGTGTCGGCCAGTGCTAGCTACGCAGC
AGAGGCTGTTGCCACTGGTGCTAAAGAGGTGAAACAAGATGCTGACAAAAGTGCAGCTGT
TGCGACTGTTGTTGCTGCCGTTGATATGGCCAGAGTGAGAGAACCAGTGATCAGCGCTGT
AGAGCAGACTGCTCAGAGGACAACCACGACTGCTGTGCACATCCAACCTGCTCAAGAACA
GGTAAGAAAGGAAGCGGAGAAGACTGCTGTAACTAAGGTAGTAGTGGCCGCCGATAAAGC
CAAGGAACAAGAATTAAAATCAAGAACCAAAGAAGTAATTACCACAAAGCAAGAGCAGAT
GCACGTAACTCATGAGCAGATAAGAAAAGAAACTGAAAAAACATTTGTACCAAAGGTAGT
AATTTCCGCAGCTAAAGCCAAAGAACAAGAAACTAGAATTTCTGAAGAAATTACTAAGAA
ACAGAAACAAGTAACTCAAGAAGCAATAAGACAGGAAACTGAGATAACTGCTGCATCCAT
GGTGGTAGTTGCCACTGCAAAGTCCACAAAACTAGAAACAGTCCCGGGAGCTCAAGAAGA
AACTACCACACAACAAGATCAAATGCACCTAAGTTATGAAAAGATAATGAAGGAAACTAG
GAAAACAGTTGTACCTAAAGTCATAGTTGCCACACCCAAAGTCAAAGAACAAGATTTAGT
The headerlist.txt file looks exactly like this:
ENST00000342992.10
ENST00000460472.6
ENST00000589042.5
ENST00000591111.5
ENST00000359218.9
ENST00000615779.4
ENST00000342175.10
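I wrote awk code that collects the headers I want to target, then prints each of those headers together with the information below it, up to the next header.
I call it like this: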
awk -f myScript.txt <headerlist.txt> <informationfile.txt>
Here is the code:
#!/bin/awk
NR == FNR {tags[$1]; next;}
for (i in tags) { if (i ~ $0) {a=1; print; next;}}
/>/ {a=0}
a
It should produce:
>Target Header
Information attached to header
.
.
.
However, I get a syntax error with no further information. The arrow does not point at any character, only at whitespace:
^ Syntax Error
How can I fix this?
Answer 0 (score: 1)
Input
$ cat HeaderList
Target Header
SomeOther Header
$ cat InfoFile
>Generic Header
Information attached to header
.
.
.
>Target Header
Information attached to header
.
.
.
>SomeOther Header
Information attached to header
.
.
.
Script
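while IFS= read -r line
do
    # Pass the header in as a variable and match it literally with index();
    # a multi-character RS needs gawk or mawk.
    awk -v hdr="$line" 'BEGIN{RS="\n>"} index($0, hdr){sub(/^>/, ""); sub(/\n$/, ""); printf ">%s\n", $0}' InfoFile
done <HeaderList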
Output
>Target Header
Information attached to header
.
.
.
>SomeOther Header
Information attached to header
.
.
.
Answer 1 (score: 1)
I think this would be a better solution:
$ awk 'NR==FNR{h[$0]; next}
$0 in h{c=2}
c&&c--' headers file
>Target Header
Information attached to header
If your headers are exactly identical in both files, you can match them with an equality check ($0 in h) and print two lines.
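The counter idiom in that script can be tried in isolation. The following is a minimal sketch, assuming two small hypothetical sample files (their names and contents are invented for illustration) in which the headers file stores each header with its leading ">" so that the equality check can succeed:

```shell
# Hypothetical sample files, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' '>Target Header' > "$tmp/headers"
printf '%s\n' '>Generic Header' 'line A' '>Target Header' 'line B' 'line C' > "$tmp/file"

# A matching header sets c to 2. The pattern "c&&c--" is true while c > 0
# and decrements c each time, so the header and one following line print.
result=$(awk 'NR==FNR{h[$0]; next}
$0 in h{c=2}
c&&c--' "$tmp/headers" "$tmp/file")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With these inputs, only ">Target Header" and "line B" are printed; raising the value assigned to c would print more lines after each match.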
If you want to print up to the next header:
$ awk 'NR==FNR{h[$0]; next}
/^>/{p=0}
$0 in h{p=1}
p' headers file
>Target Header
Information attached to header
.
.
.
With the new file layout, the script needs the following modification:
$ awk 'NR==FNR{h[">"$0]; next}
/^>/{p=0}
$1 in h{p=1}
p' headers file
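This should work as long as there is whitespace between the key (as used in the headers file) and the rest of the record. The entries in the headers file now no longer need the ">" prefix.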
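As for the syntax error in the question: awk only accepts statements such as for inside a { ... } action block, so the bare for loop in the pattern position of the original script is the likely cause. Below is a sketch of the script file rewritten along the lines of this answer. The file name myScript.awk, the variable name keep, and the shortened sample records are assumptions made for illustration, not taken from the original post:

```shell
# Hypothetical miniature inputs modelled on the files in the question.
tmp=$(mktemp -d)
printf '%s\n' 'ENST00000342992.10' 'ENST00000359218.9' > "$tmp/headerlist.txt"
cat > "$tmp/informationfile.txt" <<'EOF'
>ENST00000342992.10 cdna:known gene_symbol:TTN
GCAGTCGTGC
ACAGTCTTGA
>ENST00000460472.6 cdna:known gene_symbol:TTN
TGGTCAGAAA
>ENST00000359218.9 cdna:known gene_symbol:TTN
CACCCTTTTT
EOF

# Script file: every statement sits inside a { } action block.
cat > "$tmp/myScript.awk" <<'EOF'
# First file: remember each wanted ID with ">" prepended.
NR == FNR { tags[">" $1]; next }
# Header line: keep printing only if its first field is a wanted ID.
/^>/ { keep = ($1 in tags) }
# Print header and sequence lines while keep is true.
keep
EOF

result=$(awk -f "$tmp/myScript.awk" "$tmp/headerlist.txt" "$tmp/informationfile.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

The array lookup compares whole keys, so the dots in the transcript IDs are treated literally, which would not hold if each ID were used as a regular expression.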