Question

I am working in R and have a large gene expression matrix whose samples I have assigned to clusters, as follows:

    Sample  Cluster  Gene1  Gene2  Gene3 ...
    1       A        412    535    23
    2       A        52     969    235
    3       B        143    45     64
    4       B        535    34     75
    ...

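I want to run pairwise comparisons of all genes between each cluster and every other cluster (cluster A vs. cluster B, A vs. C, and so on), and then pick out the genes that consistently exceed defined logFC and significance thresholds in all of those comparisons, so that I end up with a list of consistently overexpressed genes for each of clusters A, B, C, etc.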

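I tried to do this with limma by creating a matrix containing all possible contrasts, but there are two problems. First, I have to pick out the relevant result tables for each cluster by hand (e.g. A-B but not B-C). Second, writing out all possible contrasts manually becomes unfeasible, because I have about 50 clusters to compare. Any ideas on how to do this in R?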
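
To illustrate the contrast problem, here is a minimal sketch of how the pairwise contrast names could be generated instead of typed by hand, and how the ones involving a given cluster could be looked up. The `clusters` vector is a made-up stand-in for the real cluster labels, and the "A-B" string format is an assumption (it is the form that `limma::makeContrasts(contrasts = ...)` accepts, provided the design matrix columns carry the same names).

```r
# Hypothetical cluster labels; in practice take unique() of the clustering vector.
clusters <- c("A", "B", "C", "D")

# All unordered pairs of clusters, one pair per column.
pairs <- combn(clusters, 2)

# One contrast string per pair, e.g. "A-B".
contrast_names <- apply(pairs, 2, paste, collapse = "-")

# For each cluster, the names of the contrasts it takes part in.
contrasts_by_cluster <- lapply(setNames(clusters, clusters), function(cl) {
  contrast_names[pairs[1, ] == cl | pairs[2, ] == cl]
})

contrasts_by_cluster$B
```

Note that a cluster appears on the right-hand side of some contrasts (B in "A-B"), so the sign of the logFC would have to be flipped for those before checking for overexpression. With 50 clusters this yields `choose(50, 2)`, i.e. 1225, contrasts.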

    My idea in pseudocode:
    1. Define clustering vector (done)
    2. Pick two clusters to be compared
    3. Remove genes that aren't expressed in either of these 2 clusters
    4. Perform multiple t-tests for all genes between these clusters
    5. Repeat the procedure for all cluster combinations
    6. Export lists of genes that are consistently overexpressed for each cluster across all comparisons
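
The steps above could be sketched in base R roughly as follows. This is only an illustration on simulated data, not a tested solution: the toy matrix `expr`, the thresholds `lfc_cut` and `p_cut`, and the helper `compare_pair` are all made up for the example. It assumes the expression values are already on a log scale (so a difference of means is a logFC), uses Welch t-tests with Benjamini-Hochberg correction within each comparison, and leaves out the expression filter from step 3.

```r
set.seed(1)  # simulated data only, so the sketch is runnable

# Hypothetical toy data: 30 samples (rows) in 3 clusters, 20 genes (columns).
cluster <- factor(rep(c("A", "B", "C"), each = 10))
expr <- matrix(rnorm(30 * 20, mean = 8), nrow = 30,
               dimnames = list(NULL, paste0("Gene", 1:20)))
# Plant one marker gene for cluster A.
expr[cluster == "A", "Gene1"] <- expr[cluster == "A", "Gene1"] + 5

lfc_cut <- 1     # assumed logFC threshold
p_cut   <- 0.05  # assumed adjusted p-value threshold

# Steps 2 and 4: test every gene between cluster `a` and cluster `b`.
compare_pair <- function(a, b) {
  xa <- expr[cluster == a, , drop = FALSE]
  xb <- expr[cluster == b, , drop = FALSE]
  p <- vapply(seq_len(ncol(expr)),
              function(j) t.test(xa[, j], xb[, j])$p.value, numeric(1))
  logFC <- colMeans(xa) - colMeans(xb)
  # TRUE where the gene is significantly higher in `a` than in `b`.
  logFC > lfc_cut & p.adjust(p, method = "BH") < p_cut
}

# Steps 5 and 6: for each cluster, keep genes that pass against all other clusters.
markers <- lapply(setNames(levels(cluster), levels(cluster)), function(a) {
  others <- setdiff(levels(cluster), a)
  up <- vapply(others, function(b) compare_pair(a, b), logical(ncol(expr)))
  colnames(expr)[rowSums(up) == length(others)]
})

markers$A
```

Looping over clusters and their `others` avoids both writing the contrasts by hand and picking the relevant result tables manually, at the cost of running each pair twice (once in each direction).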

Multiple pairwise comparisons on a gene expression matrix

0 answers: