How do I sort a DNAStringSet object?

Asked: 2019-04-12 13:13:29

Tags: r bioconductor

I have an XStringSet object:

 A DNAStringSet instance of length 151674
          width seq                                                                names               
    [1]     253 GAACAGCATGAATGTTAAAACTGAAATGGATG...TGATGGTTAGGTTTTCAGAAAAAGCAGAAGA LGKD01000001.1 Oc...
    [2]  150158 TATATATATATAGTCAATTCGAGGATGTTAGA...TCCGGATACTATTCCAGAGTTTCCTTGCAAA KQ415657.1 Octopu...
    [3]     619 ATAGACATACACACAAATATTTTTATATCACA...TATATACATATTTATACATATATATATATAT LGKD01000030.1 Oc...
    [4]     359 TCACCAGTGGCAGCCGCGGCTACAGCAAAAGG...CACGGGCTGTACAACGACCCTGATGACTCCG LGKD01000031.1 Oc...
    [5]     239 GAAGTGGTAAAGAGTGCGATGCGCTGAAAAAA...CTCTTTTTTCAGCGCATCGCACTCTTTACCA LGKD01000032.1 Oc...
    ...     ... ...
[151670]    2021 AAAACCTAAACATGTTAAATCAGAGATTGCAA...ATATATAAGTATATATATATATATATATATA KQ434080.1 Octopu...
[151671]     420 CCCCACCTCCACTATCAACACCACTACCACCA...GAAGAAGAAGAAGAAGAAGAAGAAGAAGAAG LGKD01700121.1 Oc...
[151672]     424 ACACACACACACACACACACACATATACATAT...GTAAATGTGTCCGTGTGTAGTAAGCATGTGT LGKD01700122.1 Oc...
[151673]     242 ATATATATATATATATATACATCAACATATAT...ATATGTAGACGTGTGTGTATATATATATATA LGKD01700123.1 Oc...
[151674]     214 CACACACACACACACACACACACACACACACA...ACTCATATGTACAACACACATTTATACGCTT LGKD01700124.1 Oc...
>  

I sorted its widths in descending order and got this:

> sort_oc=sort(width(oc), decreasing = TRUE)

> sort_oc[1:10]
[1] 4064693 3315273 3181678 3174068 2987449 2908116 2784626 2705535 2686354 2631168

How do I get the sequence that corresponds to each of the sorted widths?

For example, I expect a result like this:

          width   seq                                                                names               
     [567] 4064693 GAACAGCATGAATGTTAAAACTGAAATGGATG...TGATGGTTAGGTTTTCAGAAAAAGCAGAAGA  LGKD01000001.1 Oc...           
     [350] 3315273 AAAACCTAAACATGTTAAATCAGAGATTGCAA...ATATATAAGTATATATATATATATATATATA KQ434080.1 Octopu... 

and so on.

1 Answer:

Answer 0 (score: 2)

Andrew's answer is very close, but because a DNAStringSet is not a data.frame, you need to use the Biostrings::width function (rather than regular subsetting) to get the widths:

oc[order(width(oc), decreasing = TRUE)]

This returns a DNAStringSet containing the same sequences, ordered by width from longest to shortest.
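The difference from the attempt in the question is that sort() returns only the sorted widths, while order() returns the index positions that would sort them, and those positions can then be used to subset the set itself. A minimal sketch on a toy object may make this concrete; the sequences, the names, and the variables toy, idx and sorted_toy are made up for illustration and are not from the original data:

```r
library(Biostrings)

# Toy DNAStringSet; sequences and names are invented for this example.
toy <- DNAStringSet(c(a = "ACGT", b = "ACGTACGTAC", c = "AC"))

# order() gives the permutation of indices that sorts the widths,
# so subsetting with it keeps each sequence and name with its width.
idx <- order(width(toy), decreasing = TRUE)
sorted_toy <- toy[idx]

width(sorted_toy)  # 10 4 2
names(sorted_toy)  # "b" "a" "c"
```

The original positions (such as the [567] and [350] shown in the expected output) are not kept in the printed result, but they are available as idx if needed.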