Converting text to a data.frame

Time: 2017-04-25 13:52:00

Tags: r text dataframe

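I have several records of text data like the one below (this is a single record). I want to convert this text into a data.frame, with column headers such as LOCUS, DEFINITION, ACCESSION, and so on. What is the simplest way to do this?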


LOCUS       KU217831                 477 bp    DNA     linear   BCT 21-JUN-2016
DEFINITION  Candidatus 
            ribosomal RNA gene, partial sequence.
ACCESSION   KU217831
VERSION     KU217831.1  GI:972300480
KEYWORDS    .
SOURCE      Candidatus 
  ORGANISM  Candidatus 
            Bacteria; Planctomycetes; Planctomycetia; Candidatus Brocadiales;
REFERENCE   1  (bases 1 to 477)
  AUTHORS   Zheng,Y., Jiang,X., Hou,L., Liu,M., Lin,X., Gao,J., Li,X., Yin,G.,
            Yu,C. and Wang,R.
  TITLE     Shifts in the community structure and activity of anaerobic
            ammonium oxidation bacteria along an estuarine salinity gradient:
            Shift in anammox along salinity gradient
  JOURNAL   Biogeosciences (2016) In press
REFERENCE   2  (bases 1 to 477)
  AUTHORS   Zheng,Y. and Hou,L.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-NOV-2015) State Key Laboratory of Estuarine and
            Coastal Research, East China Normal University, North Zhongshan
            Road, Shanghai, Shanghai 200062, China
COMMENT     Sequences were screened for chimeras by the submitter using Qiime
            1.9.0.

           ##Assembly-Data-START##
           Sequencing Technology :: Sanger dideoxy sequencing
           ##Assembly-Data-END##

FEATURES             Location/Qualifiers
     source          1..477
                     /organism="Candidatus Anammoxoglobus propionicus"
                     /mol_type="genomic DNA"
                     /isolation_source="Yangtze Estuary sediment"
                     /db_xref="taxon:363279"
                     /clone="Y4_Winter_53"
                     /country="China"
                     /PCR_primers="fwd_name: amx368f, fwd_seq: ttcgcaatgcccgaaagg, rev_name: amx820r, rev_seq: aaaacccctctacttagtgccc"
     rRNA            <1..>477
                     /product="16S ribosomal RNA"
ORIGIN
        1 ttcgcaatgc ccgaaagggt gacgaagcga cgccgcgtgt gggaagaagg ccttcgggtt
       61 gtaaaccact gtcaggagtt aagaaatata gaaatgttaa tagcattttt atttgactaa
      121 agctccagag gaagccacgg ctaactctgt gccagcagcc gcggtaatac agaggtggca
      181 agcgttgttc ggaattattg ggcgtaaaga gcacgtaggc ggccttgcaa gtcagttgtg
      241 aaatccttcc gcttaacggg agaacggcgg ctgatactac agggctagag tacgggaggg
      301 gagagcggaa cttctggtgg agcggtgaaa tgcgtagata tcagaaggaa cgccggcggc
      361 gaaagcggct ctctggcccg aaactgacgc tgagtgtgcg aaagctaggg gagcaaacgg
      421 gattagatac cccggtagtc ctagccgtaa acgatgggca ctaagtagag gggtttt
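
One possible approach, sketched in base R only: treat every line that starts with an uppercase keyword in column 1 as the start of a new field, fold the indented lines that follow into that field, and turn the result into a one-row data.frame. The function name `parse_record` and the `" | "` separator for repeated keywords are hypothetical choices made for illustration, and the input below is an abbreviated excerpt of the record above. The sketch assumes top-level keywords always start in column 1; any line that does not follow that layout is folded into the preceding field, and sub-keywords such as AUTHORS or TITLE stay inside their parent's value rather than becoming columns of their own.

```r
# Parse one record (a character vector of lines) into a one-row data.frame.
# Assumption: top-level keywords are uppercase and start in column 1;
# indented lines continue the previous keyword's value.
parse_record <- function(lines) {
  lines <- lines[nzchar(trimws(lines))]        # drop blank lines
  starts <- grepl("^[A-Z]", lines)             # TRUE where a new field begins
  stopifnot(length(lines) > 0, starts[1])      # record must open with a keyword

  # Join each keyword line with its continuation lines
  fields <- vapply(
    split(lines, cumsum(starts)),
    function(x) paste(trimws(x), collapse = " "),
    character(1)
  )
  fields <- unname(gsub("\\s+", " ", fields))  # squeeze runs of whitespace

  key   <- sub("^([A-Z]+).*$", "\\1", fields)  # leading keyword
  value <- trimws(sub("^[A-Z]+", "", fields))  # everything after it

  # Repeated keywords (e.g. REFERENCE) are merged into a single column
  keys <- unique(key)
  merged <- lapply(keys, function(k) paste(value[key == k], collapse = " | "))
  names(merged) <- keys
  as.data.frame(merged, stringsAsFactors = FALSE)
}

# Abbreviated excerpt of the record in the question
record <- c(
  "LOCUS       KU217831                 477 bp    DNA     linear   BCT 21-JUN-2016",
  "DEFINITION  Candidatus ",
  "            ribosomal RNA gene, partial sequence.",
  "ACCESSION   KU217831",
  "VERSION     KU217831.1  GI:972300480",
  "REFERENCE   1  (bases 1 to 477)",
  "  AUTHORS   Zheng,Y. and Hou,L.",
  "REFERENCE   2  (bases 1 to 477)",
  "  TITLE     Direct Submission"
)

df <- parse_record(record)
str(df)
```

For a file holding several records, one option is to read it with `readLines()`, split the lines at each line beginning with LOCUS, apply `parse_record` to every piece, and combine the rows with `rbind()`. That last step only works as written when all records carry the same set of keywords.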

0 answers:

No answers yet