R read.table() fails to recognize some field separators and newlines

Asked: 2015-03-06 07:06:49

Tags: r read.table

I ran into this problem while reading the tab-delimited table GO_MF.txt into R.

Here is part of my input:

YAL004W YAL004W Unknown                              
YAL005C SSA1    unfolded protein binding    ATPase activity                          
YAL007C ERP2    molecular_function                               
YAL008W FUN14   molecular_function                               
YAL009W SPO7    phosphoprotein phosphatase activity                              
YAL012W CYS3    cystathionine gamma-lyase activity                               
YAL013W DEP1    molecular_function                               
YAL014C SYN8    SNAP receptor activity                               
YAL017W PSK1    protein serine/threonine kinase activity

When I open the file in Excel it looks fine: 3606 rows and at most 11 columns.

However, when I try to read the table into R with the following commands:

no_col <- max(count.fields("GO_MF.txt", sep = "\t"), na.rm = TRUE)

pop <- read.table("GO_MF.txt", sep = "\t", fill = TRUE, as.is = TRUE, col.names = 1:no_col)

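I get only 2284 obs. of 11 variables. When I View(pop) in RStudio, I find one cell in which \t and \n apparently were not recognized, so many rows ended up in a single value (see below). This is my pop[261,3]: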

5-flap endonuclease activity
YBR229C ROT2    alpha-glucosidase activity
YBR230C OM14    molecular_function
YBR231C SWC5    molecular_function
YBR232C YBR232C Unknown
YBR233W PBP2    mRNA binding
YBR234C ARC40   ubiquitin binding
YBR235W VHC1    ion transmembrane transporter activity
YBR236C ABD1    mRNA (guanine-N7-)-methyltransferase activity
YBR237W PRP5    RNA-dependent ATPase activity
YBR241C YBR241C substrate-specific transmembrane transporter activity
YBR242W YBR242W molecular_function
YBR243C ALG7    UDP-N-acetylglucosamine-dolichyl-phosphate N-acetylglucosaminephosphotransferase activity
YBR244W GPX2    phospholipid-hydroperoxide glutathione peroxidase activity  glutathione peroxidase activity
YBR245C ISW1    nucleosome binding  ATPase activity rDNA binding    DNA binding
YBR246W RRT2    molecular_function
YBR248C HIS7    imidazoleglycerol-phosphate synthase activity
YBR249C ARO4    3-deoxy-7-phosphoheptulonate synthase activity
YBR250W SPO23   molecular_function
YBR251W MRPS5   structural constituent of ribosome
YBR253W SRB6    molecular_function
YBR255W MTC4    molecular_function
YBR258C SHG1    histone methyltransferase activity (H3-K4 specific)
YBR260C RGD1    Rho GTPase activator activity   phosphatidylinositol-4,5-bisphosphate binding   phosphatidylinositol-3,5-bisphosphate binding   phosphatidylinositol-3-phosphate binding    phosphatidylinositol-4-phosphate binding    phosphatidylinositol-5-phosphate binding
YBR261C TAE1    N-terminal protein N-methyltransferase activity
YBR262C AIM5    molecular_function
YBR263W SHM1    glycine hydroxymethyltransferase activity
YBR264C YPT10   GTPase activity guanyl nucleotide binding
YBR266C SLM6    Unknown
YBR267W REI1    sequence-specific DNA binding
YBR269C FMP21   molecular_function
YBR270C BIT2    molecular_function
YBR271W EFM2    S-adenosylmethionine-dependent methyltransferase activity   protein-lysine N-methyltransferase activity
YBR273C UBX7    molecular_function
YBR275C RIF1    telomeric DNA binding
YBR277C YBR277C Unknown
YBR278W DPB3    double-stranded DNA binding single-stranded DNA binding DNA-directed DNA polymerase activity
YBR280C SAF1    ubiquitin-protein ligase activity
YBR281C DUG2    peptidase activity  gamma-glutamyltransferase activity  omega peptidase activity
YBR283C SSH1    protein transmembrane transporter activity  signal sequence binding
YBR284W YBR284W AMP deaminase activity  molecular_function
YBR287W YBR287W molecular_function
YBR288C APM3    molecular_function
YBR290W BSD2    protein binding
YBR291C CTP1    tricarboxylate secondary active transmembrane transporter activity
YBR293W VBA2    basic amino acid transmembrane transporter activity drug transmembrane transporter activity
YBR294W SUL1    sulfate transmembrane transporter activity
YBR297W MAL33   sequence-specific DNA binding transcription factor activity
YBR298C MAL31   alpha-glucoside:hydrogen symporter activity
YBR299W MAL32   sucrose alpha-glucosidase activity  alpha-glucosidase activity  maltose alpha-glucosidase activity
YBR301W PAU24   molecular_function
YCL001W RER1    molecular_function
YCL002C YCL002C molecular_function
YCL005W LDB16   molecular_function
YCL008C STP22   ubiquitin binding
YCL009C ILV6    enzyme regulator activity   acetolactate synthase activity
YCL010C SGF29   methylated histone residue binding  RNA polymerase II transcription factor recruiting transcription factor activity
YCL016C DCC1    molecular_function
YCL023C YCL023C Unknown
YCL029C BIK1    microtubule binding protein homodimerization activity
YCL030C HIS4    phosphoribosyl-ATP diphosphatase activity   phosphoribosyl-AMP cyclohydrolase activity  histidinol dehydrogenase activity
YCL033C MXR2    peptide-methionine (R)-S-oxide reductase activity
YCL035C GRX1    disulfide oxidoreductase activity   glutathione transferase activity    glutathione peroxidase activity
YCL037C SRO9    RNA binding
YCL042W YCL042W molecular_function
YCL044C MGR1    misfolded protein binding
YCL046W YCL046W Unknown
YCL047C POF1    ATPase activity
YCL048W SPS22   molecular_function
YCL049C YCL049C molecular_function
YCL050C APA1    bis(5-nucleosyl)-tetraphosphatase activity

Any ideas?

Thanks in advance!
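
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the runaway cell above starts with "5-flap" and ends with "bis(5-nucleosyl)", which is what you would see if the original file contained 5'-flap and bis(5'-nucleosyl) and both apostrophes were consumed. read.table() and count.fields() treat both " and ' as quote characters by default, so an apostrophe inside a tab-separated field can open a quoted string that swallows tabs and newlines until the next apostrophe. The example below rebuilds that situation in a temporary file. The apostrophes, the placeholder ID GENE_X, and the small sample itself are assumptions made for illustration; the other rows are copied from the question.

```r
# Hypothetical miniature of GO_MF.txt written to a temporary file.
# Assumption: the original file has apostrophes in 5'-flap and 5'-nucleosyl.
sample_lines <- c(
  "YAL004W\tYAL004W\tUnknown",
  "YAL005C\tSSA1\tunfolded protein binding\tATPase activity",
  "YAL007C\tERP2\tmolecular_function",
  "YAL008W\tFUN14\tmolecular_function",
  "YAL009W\tSPO7\tphosphoprotein phosphatase activity",
  "GENE_X\tGENE_X\t5'-flap endonuclease activity",   # placeholder ID
  "YBR229C\tROT2\talpha-glucosidase activity",
  "YBR245C\tISW1\tnucleosome binding\tATPase activity\trDNA binding\tDNA binding",
  "YCL050C\tAPA1\tbis(5'-nucleosyl)-tetraphosphatase activity",
  "YCL049C\tYCL049C\tmolecular_function"
)
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

# Same pattern as in the question: default quoting, so ' is a quote character.
n_default <- max(count.fields(path, sep = "\t"), na.rm = TRUE)
broken <- read.table(path, sep = "\t", fill = TRUE, as.is = TRUE,
                     col.names = paste0("V", seq_len(n_default)))

# Quoting and comment handling switched off: every physical line is one row.
n_fixed <- max(count.fields(path, sep = "\t", quote = "", comment.char = ""))
fixed <- read.table(path, sep = "\t", quote = "", comment.char = "",
                    fill = TRUE, as.is = TRUE,
                    col.names = paste0("V", seq_len(n_fixed)))

# Compare the row counts with the number of lines in the file.
print(c(lines = length(sample_lines),
        default_quote = nrow(broken),
        no_quote = nrow(fixed)))
print(fixed$V3[6])
```

If this is indeed the cause, adding quote = "" to both the count.fields() and read.table() calls in the question should bring back all 3606 rows. comment.char = "" is included only as a precaution in case a field contains #. read.delim() is another option, since its default quote set contains only the double quote, though it also defaults to header = TRUE.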

0 answers:

No answers yet.