Removing duplicate row.names from a data frame

Time: 2015-07-17 18:07:36

Tags: r

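Please have a look at the code below. When I load the data with row.names = 1, R complains about duplicate row.names. However, when I try to remove the duplicates, the dimensions of the data.frame stay the same??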

> Seqbuster <- read.csv(file="IsomiR_Robin.csv",row.names=1,sep="\t",header=T)

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed

> Seqbuster <- read.csv(file="IsomiR_Robin.csv",sep="\t",header=T)

> dim(Seqbuster)
[1] 23245    20

> Seqbuster <- unique(Seqbuster)

> dim(Seqbuster)
[1] 23245    20


> head(Seqbuster)
                                GeneID                    seq             name  freq           mir start end mism add   t5   t3       s5      s3    DB ambiguity
1  hsa-let-7a-3p_ATACAATCTACTGTCTTTCCT  ATACAATCTACTGTCTTTCCT seq_11740_x11739 11739 hsa-let-7a-3p    59  79    0   0 d-CT d-CT ATAACTAT TTTCCTA miRNA         1
2   hsa-let-7a-3p_ATATACAATCTACTGTCTTT   ATATACAATCTACTGTCTTT seq_18478_x18477 18477 hsa-let-7a-3p    52  71  1AC   0    0  u-C ATAACTAT  TTTCCT miRNA         1
3  hsa-let-7a-3p_ATATACAATCTACTGTCTTTC  ATATACAATCTACTGTCTTTC seq_18076_x18075 18075 hsa-let-7a-3p    52  72  1AC   0    0    0 ATAACTAT  TTTCCT miRNA         1
4 hsa-let-7a-3p_ATATACAATCTACTGTCTTTCC ATATACAATCTACTGTCTTTCC seq_21557_x21556 21556 hsa-let-7a-3p    52  73  1AC   0    0  d-C ATAACTAT  TTTCCT miRNA         1
5 hsa-let-7a-3p_ATATACAATCTACTGTCTTTCT ATATACAATCTACTGTCTTTCT seq_18561_x18560 18560 hsa-let-7a-3p    52  72  1AC u-T    0    0 ATAACTAT  TTTCCT miRNA         1
6 hsa-let-7a-3p_CCATACAATCTACTGTCTTTCT CCATACAATCTACTGTCTTTCT seq_20933_x20932 20932 hsa-let-7a-3p    52  72  2CT u-T    0    0 ATAACTAT  TTTCCT miRNA         1
         logFC     logCPM          LR     PValue       FDR
1  0.552704905 0.18660575 2.805573421 0.09393725 0.1744389
2  0.179752394 0.12956983 0.264209389 0.60724288 0.7164197
3  0.135260378 0.99331911 0.334612036 0.56295589 0.6789419
4  0.007554301 0.66189664 0.001413451 0.97000988 0.9809448
5 -0.107374125 2.56150056 0.251272726 0.61618033 0.7237010
6  0.062867911 0.04769834 0.018075575 0.89305036 0.9300404

1 answer:

Answer 0: (score: 0)

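Calling unique() on the whole data.frame does not remove the duplicates in the first column; it only drops rows that are identical in every column. Using 1 as the row.names argument creates a data.frame with your GeneID values as the row names, and those must be unique. You will find that length(unique(Seqbuster$GeneID)) is less than 23245.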
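
As a rough illustration of the point above, here is a minimal sketch on a toy data frame. The data frame `df` and its values are made up for the example and are not the asker's data; whether keeping the first row per GeneID or suffixing the IDs is appropriate depends on what the duplicate rows mean in your file.

```r
# Toy stand-in for Seqbuster (hypothetical values, not the real file)
df <- data.frame(
  GeneID = c("a", "a", "b", "c"),
  freq   = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# unique() compares entire rows; the two "a" rows differ in freq,
# so nothing is dropped even though GeneID is duplicated
nrow(unique(df))
length(unique(df$GeneID))

# Option 1: keep only the first row for each GeneID,
# then promote GeneID to row names
dedup <- df[!duplicated(df$GeneID), ]
rownames(dedup) <- dedup$GeneID
dedup$GeneID <- NULL

# Option 2: keep every row and make the IDs unique by adding suffixes
ids <- make.unique(df$GeneID)
```

To see which IDs are repeated before deciding, something like `df$GeneID[duplicated(df$GeneID)]` lists the offending values.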