Read a TSV file and change the first column

Time: 2017-09-13 06:02:50

Tags: r

I have the following file, and I am only interested in column 1 and the last column (14):

sp0000001-mRNA-1    f0651baa110098a342ff92218202e4d0    1016    Pfam    PF00226 DnaJ domain 76  137 7.5E-18 T   02-05-2017  IPR001623   DnaJ domain
sp0000001-mRNA-1    f0651baa110098a342ff92218202e4d0    1016    Pfam    PF05266 Protein of unknown function (DUF724)    832 1015    3.8E-41 T   02-05-2017  IPR007930   Protein of unknown function DUF724
sp0000001-mRNA-1    f0651baa110098a342ff92218202e4d0    1016    Pfam    PF11926 Domain of unknown function (DUF3444)    419 607 2.6E-56 T   02-05-2017  IPR024593   Domain of unknown function DUF3444
sp0000005-mRNA-1    8db7c080b2bc76bf090fec8662fcae20    243 Pfam    PF01472 PUA domain  155 232 1.3E-19 T   02-05-2017  IPR002478   PUA domain  GO:0003723
sp0000006-mRNA-1    edf5c2bb6341fe44b3da447099a5b2df    282 Pfam    PF03083 Sugar efflux transporter for intercellular exchange 198 261 1.4E-15 T   02-05-2017  IPR004316   SWEET sugar transporter GO:0016021
sp0000006-mRNA-1    edf5c2bb6341fe44b3da447099a5b2df    282 Pfam    PF03083 Sugar efflux transporter for intercellular exchange 7   91  1.1E-25 T   02-05-2017  IPR004316   SWEET sugar transporter GO:0016021
sp0000006-mRNA-2    edf5c2bb6341fe44b3da447099a5b2df    282 Pfam    PF03083 Sugar efflux transporter for intercellular exchange 198 261 1.4E-15 T   02-05-2017  IPR004316   SWEET sugar transporter GO:0016021
sp0000006-mRNA-2    edf5c2bb6341fe44b3da447099a5b2df    282 Pfam    PF03083 Sugar efflux transporter for intercellular exchange 7   91  1.1E-25 T   02-05-2017  IPR004316   SWEET sugar transporter GO:0016021
sp0000006-mRNA-3    51ff56e496d48682f7af1b2478190834    235 Pfam    PF03083 Sugar efflux transporter for intercellular exchange 130 214 9.6E-24 T   02-05-2017  IPR004316   SWEET sugar transporter GO:0016021
sp0000006-mRNA-3    51ff56e496d48682f7af1b2478190834    235 Pfam    PF03083 Sugar efflux transporter for intercellular exchange 7   91  7.5E-26 T   02-05-2017  IPR004316   SWEET sugar transporter GO:0016021
sp0000007-mRNA-1    ed1eda6e176feb124dbef8934b633df0    553 Pfam    PF03106 WRKY DNA -binding domain    281 338 2.6E-26 T   02-05-2017  IPR003657   WRKY domain GO:0003700|GO:0006355|GO:0043565

As a result, I am trying to get the following file:

sp0000001,n/a
sp0000005,GO:0003723
sp0000006,GO:0016021
sp0000007,GO:0003700
sp0000007,GO:0006355
sp0000007,GO:0043565

I tried to read the input file as follows:

> interproscan <- read.csv(file="ed.tsv", sep = "\t")[1,14]
Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed

What is the best way to solve this problem?

1 Answer:

Answer 0: (score: 0)

It looks like a problem with duplicate row names. I tried saving your TSV file, but it was not saved as a tab-delimited file on my end.
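One way to check whether a file really is tab-delimited, and how many fields each line has, is count.fields. The sketch below uses a hypothetical two-line stand-in for ed.tsv (the "x" placeholder values are made up). If the first line has one field fewer than the widest line, as the rows without a GO term do in the question, read.csv with its default header = TRUE takes the first column as row names, which would probably explain the error.

```r
# Hypothetical stand-in for ed.tsv: the first line has 13 tab-separated
# fields (no GO term), the second has 14.
lines <- c(
  paste(c("sp0000001-mRNA-1", rep("x", 12)), collapse = "\t"),
  paste(c("sp0000005-mRNA-1", rep("x", 12), "GO:0003723"), collapse = "\t")
)
path <- tempfile(fileext = ".tsv")
writeLines(lines, path)

# Count the tab-separated fields on each line. quote = "" and
# comment.char = "" stop apostrophes and '#' in free-text descriptions
# from being interpreted as quotes or comments.
n_fields <- count.fields(path, sep = "\t", quote = "", comment.char = "")
print(n_fields)
```

If every line reports a single field, the tabs were most likely lost when the file was copied or saved.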

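Also note that [1,14] selects a single cell (row 1, column 14), whereas [c(1,14)] selects columns 1 and 14. Anyway, try this, with row.names set to NULL: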

> interproscan <- read.csv(file="ed.tsv", sep = "\t", row.names=NULL)[c(1,14)]
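
Going from those two columns to the output requested in the question could look roughly like the sketch below. It is only a sketch under several assumptions: the file has no header line (so header = FALSE is used here instead of row.names = NULL), rows without a GO term simply lack the 14th field, the gene ID is everything before "-mRNA-", and a gene gets "n/a" only if none of its rows carries a GO term. The sample data and the names dat, gene, go, terms and out are made up for illustration.

```r
# Hypothetical stand-in for ed.tsv; only columns 1 and 14 matter here.
rows <- list(
  c("sp0000001-mRNA-1", rep("x", 12)),
  c("sp0000001-mRNA-1", rep("x", 12)),
  c("sp0000005-mRNA-1", rep("x", 12), "GO:0003723"),
  c("sp0000006-mRNA-1", rep("x", 12), "GO:0016021"),
  c("sp0000006-mRNA-2", rep("x", 12), "GO:0016021"),
  c("sp0000007-mRNA-1", rep("x", 12), "GO:0003700|GO:0006355|GO:0043565")
)
path <- tempfile(fileext = ".tsv")
writeLines(vapply(rows, paste, character(1), collapse = "\t"), path)

# header = FALSE: assume there is no header line.
# fill = TRUE plus 14 explicit column names: pad rows lacking the 14th field.
dat <- read.delim(path, header = FALSE, fill = TRUE, quote = "",
                  comment.char = "", col.names = paste0("V", 1:14),
                  stringsAsFactors = FALSE)

# Strip the transcript suffix: sp0000001-mRNA-1 -> sp0000001
gene <- sub("-mRNA-.*$", "", dat$V1)
# Mark rows without a GO annotation.
go <- ifelse(is.na(dat$V14) | dat$V14 == "", "n/a", dat$V14)

# Split "GO:a|GO:b" into separate terms, one output row per term,
# then drop duplicate gene/term pairs.
terms <- strsplit(go, "|", fixed = TRUE)
out <- unique(data.frame(gene = rep(gene, lengths(terms)),
                         go = unlist(terms),
                         stringsAsFactors = FALSE))

# Assumption: keep "n/a" only for genes that have no GO term at all.
has_term <- out$gene %in% out$gene[out$go != "n/a"]
out <- out[out$go != "n/a" | !has_term, ]

# Print as comma-separated lines; pass a file name to write to disk instead.
write.table(out, sep = ",", quote = FALSE, row.names = FALSE, col.names = FALSE)
```

To run this on the real file, replace path with "ed.tsv" and drop the lines that build the sample.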