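I ran into a very strange problem when trying to read a data table into R with read.table. Instead of my actual data, I get a single column of NA values under the header ÿþD (which appears nowhere in my code or input file). My code and data file are below. If you have any idea why I am getting this odd result, please let me know. I have been searching for hours and found nothing.
Code: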
Raw_Annotation_data_AllDeer<-read.table("Sample.txt", as.is=TRUE, row.names=NULL,
check.names = TRUE, sep="\t", fill=T, header=T,
strip.white = T, quote = "", na.strings = "NA",
comment.char="")
File (header plus first 5 rows):
Document_Name Sequence_Name Track_Name Type Name Sequence Minimum Min_(with_gaps) Maximum Max_(with_gaps) Length Length_(with_gaps) #_Intervals Direction Average_Quality Coverage modified_by Polymorphism_Type Strand-Bias Strand-Bias_>50%_P-value Strand-Bias_>65%_P-value Variant_Frequency Variant_Nucleotide(s) Variant_P-Value_(approximate)
Chr2_FT Chr2 Chr2.bed CDS 10000_ARHGAP15 GAAAGAATCATTAACAGTTAGAAGTTGATG-AAGTTTCAATAACAAGTGGGCACTGAGAGAAAG 55916421 56019336 55916483 56019399 63 64 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10001_ARHGAP15 GATACCACTGTACTATGCAGAAATCTACAAATTCTGATATCCCTGTGGAAACACTGAATCCCACCCGCCAAGGCACTGGAGCTGTGCAAATGAGAATCAAAAATGCCAACAGCCACCATGACAGGCTGAGCCAAAGTAAATCTATGATCCTCACCGAAGTTGGGAAGGTCACTGAACCT 55936395 56039336 55936573 56039514 179 179 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10002_HNMT CTGACACAATAATAATGAGAATCTTAGCATTGGTAGCTAAGAGACTATGGAAGAATTTCAGGGTAGCTGGGATGTCTTTAACATAATACAGCAT 61980947 62093615 61981040 62093708 94 94 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10003_HNMT CTGAATCATATGAATAAAGTCCCACCTCTGAAGTTCTTTTTTCTCCATCATTCTATTTTGATATTCAGATGATGTCTCTTTATGCCAAGCAAACTTTATGTTCTCAAGGTTTGATGTCTTTGCTACAAGCT 61986120 62098794 61986250 62098924 131 131 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10004_HNMT CTTTGTACTTGGTGATTTGTTCAGCACTTGGTTCAACAACTTCATTGATGATATGAACTCCTGGGTACTGAGCTTGCACTTTGGAGAGAATTTGAAGGTCAATTTCAC 61987773 62100453 61987880 62100560 108 108 1 forward User
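Answer 0 (score: 0)
The "#" in the header row (#_Intervals) could be a problem, because "#" is the default comment.char and would also cause trouble if it appeared in the data. Your call already sets comment.char="", though, and check.names = TRUE seems to catch it by turning the header into a valid column name. This "works" with the data pasted in through text= and sep="\t" removed: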
Raw_Annotation_data_AllDeer<-read.table(text="Document_Name Sequence_Name Track_Name Type Name Sequence Minimum Min_(with_gaps) Maximum Max_(with_gaps) Length Length_(with_gaps) #_Intervals Direction Average_Quality Coverage modified_by Polymorphism_Type Strand-Bias Strand-Bias_>50%_P-value Strand-Bias_>65%_P-value Variant_Frequency Variant_Nucleotide(s) Variant_P-Value_(approximate)
Chr2_FT Chr2 Chr2.bed CDS 10000_ARHGAP15 GAAAGAATCATTAACAGTTAGAAGTTGATG-AAGTTTCAATAACAAGTGGGCACTGAGAGAAAG 55916421 56019336 55916483 56019399 63 64 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10001_ARHGAP15 GATACCACTGTACTATGCAGAAATCTACAAATTCTGATATCCCTGTGGAAACACTGAATCCCACCCGCCAAGGCACTGGAGCTGTGCAAATGAGAATCAAAAATGCCAACAGCCACCATGACAGGCTGAGCCAAAGTAAATCTATGATCCTCACCGAAGTTGGGAAGGTCACTGAACCT 55936395 56039336 55936573 56039514 179 179 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10002_HNMT CTGACACAATAATAATGAGAATCTTAGCATTGGTAGCTAAGAGACTATGGAAGAATTTCAGGGTAGCTGGGATGTCTTTAACATAATACAGCAT 61980947 62093615 61981040 62093708 94 94 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10003_HNMT CTGAATCATATGAATAAAGTCCCACCTCTGAAGTTCTTTTTTCTCCATCATTCTATTTTGATATTCAGATGATGTCTCTTTATGCCAAGCAAACTTTATGTTCTCAAGGTTTGATGTCTTTGCTACAAGCT 61986120 62098794 61986250 62098924 131 131 1 forward User
Chr2_FT Chr2 Chr2.bed CDS 10004_HNMT CTTTGTACTTGGTGATTTGTTCAGCACTTGGTTCAACAACTTCATTGATGATATGAACTCCTGGGTACTGAGCTTGCACTTTGGAGAGAATTTGAAGGTCAATTTCAC 61987773 62100453 61987880 62100560 108 108 1 forward User ", as.is=TRUE, row.names=NULL, check.names = TRUE, fill=T, header=T, strip.white = T, quote = "", na.strings = "NA", comment.char="")
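One more thing worth checking, as an assumption on my part since I have not seen the actual file: ÿþ is how the two bytes 0xFF 0xFE look when displayed as Latin-1, and those bytes are the byte order mark of a UTF-16 little-endian file. If Sample.txt was exported as UTF-16, read.table would need a fileEncoding argument. The sketch below builds a small hypothetical UTF-16LE file in a temporary location (the two-column content is made up for illustration), checks its first two bytes, and reads it with fileEncoding = "UTF-16".

```r
# Build a tiny tab-separated sample (hypothetical content, two columns only).
txt <- "Document_Name\t#_Intervals\nChr2_FT\t1\n"

# Encode the text as UTF-16LE and prepend the byte order mark FF FE.
body <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
path <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xff, 0xfe)), body), path)

# Diagnostic: a UTF-16LE file with a BOM starts with the bytes ff fe.
first_bytes <- readBin(path, what = "raw", n = 2)
has_utf16le_bom <- identical(first_bytes, as.raw(c(0xff, 0xfe)))
print(has_utf16le_bom)

# Read the file, letting "UTF-16" use the BOM to pick the byte order.
dat <- read.table(path, fileEncoding = "UTF-16", sep = "\t", header = TRUE,
                  quote = "", comment.char = "", check.names = TRUE,
                  as.is = TRUE)
str(dat)
```

If the same two-byte check comes back TRUE for the real file, adding fileEncoding = "UTF-16" to the original call is the first thing I would try. I used "UTF-16" rather than "UTF-16LE" here so that the byte order mark is consumed instead of ending up in the first column name.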