I am using R to perform a cluster analysis. I have a data set that looks like this:
geneid S1 S2 S3 S4 M3 M4 M6
ENSRNOG00000000012 0.8032270364 1.5058909297 1.0496307677 1.4168397419 0.2750070475 0.9708536543 1.1570437101
ENSRNOG00000000021 3.0250287945 3.7782085764 3.4449320489 2.7004397181 3.2464080872 3.1795110503 2.9429835982
ENSRNOG00000000024 2.0669502439 2.5210507369 2.2555007331 1.7949356628 1.4382928516 1.9373443922 1.5210507369
ENSRNOG00000000033 2.7004397181 2.4724877715 2.1391420191 2.1309308698 1.8032270364 1.8757800631 1.7527485914
ENSRNOG00000000034 1.4541758932 1.3617683594 0.9963887464 0.7136958148 0.8718436485 0.6690267655 0.516015147
ENSRNOG00000000040 4.9420452599 5.0565835284 5.3527938294 4.8639384504 4.0891591319 4.2742616613 3.1731274335
ENSRNOG00000000041 2.6194130106 3.2637856139 3.4489009511 3.2032011563 3.7015490569 3.5410191531 3.0976107966
ENSRNOG00000000042 4.1263947376 4.6284819944 3.9731520379 3.014355293 3.0018022426 2.8972404256 2.5285713189
ENSRNOG00000000043 5.1051751923 5.7436226761 6.3211163506 6.5046203924 6.6071823374 6.2467880938 5.8371863852
ENSRNOG00000000044 3.2854022189 4.0465783666 4.1513717763 3.9250499647 4.5316933609 4.2727697324 3.7980505148
ENSRNOG00000000047 2.5248159284 1.8933622108 1.5210507369 1.0908534305 1.6229303509 1.9523335664 2.0976107966
ENSRNOG00000000048 3.5722833667 3.8569856898 3.8841094514 3.7202784652 4.2311251579 3.8399595875 3.6028844087
ENSRNOG00000000054 2.0823619696 2.6241008946 2.5058909297 1.3729520979 0.748461233 0.9927684308 0.8073549221
ENSRNOG00000000062 3.846994687 4.0609120496 4.1647058402 3.6644828404 3.6496154591 3.2957230245 3.1602748314
ENSRNOG00000000064 4.971543554 4.9993235782 5.1185258489 4.194559886 3.8639384504 4.2883585622 4.0531113365
ENSRNOG00000000066 3.2809563138 4.0413306068 4.0759604132 3.5422580498 3.7495342677 2.9411063109 2.6040713237
ENSRNOG00000000068 3.2986583156 3.5204222485 3.7436226761 3.3132458518 3.6427015718 3.4019034716 3.166715445
ENSRNOG00000000070 1.5235619561 2.266036894 2.2433644257 1.6229303509 2.1009776477 2.2630344058 1.9107326619
ENSRNOG00000000073 2.6780719051 2.9269482479 1.8559896973 1.3950627995 2.0426443374 2.266036894 1.9297909977
ENSRNOG00000000075 2.8559896973 2.9392265777 2.7235585615 2.2448870591 1.5109619193 1.8718436485 1.7092906357
ENSRNOG00000000081 4.8609627979 5.1501534552 5.7869883453 5.7993463875 5.6383635059 4.5478199566 4.2764966656
ENSRNOG00000000082 4.0018022426 4.1787146412 4.2067213574 3.5285713189 3.8063240574 4.0626398283 3.2913088598
ENSRNOG00000000091 0.7697717392 1.0036022367 0.867896464 0.5459683691 1.4541758932 1.8032270364 1.7311832416
ENSRNOG00000000095 3.5410191531 3.5348086612 3.9527994779 3.408711861 3.6028844087 3.0992952043 2.8011586561
ENSRNOG00000000096 1.4568061492 1.5655971759 1.6135316529 1.7527485914 1.4594316186 1.8559896973 1.673556424
ENSRNOG00000000098 2.414135533 3.5122268865 3.5147534984 3.3015876466 4.0755326312 3.8747969659 3.187451054
ENSRNOG00000000104 2.7125957804 2.5969351424 2.5459683691 1.3219280949 1.5849625007 1.6088092427 1.3161457423
ENSRNOG00000000105 1.6016965165 1.3015876466 1.1890338244 1.516015147 0.7570232465 0.6870606883 0.6040713237
ENSRNOG00000000108 3.2854022189 3.6976626335 3.8865501473 2.6369145804 2.6040713237 2.3923174228 1.8953026213
ENSRNOG00000000111 1.6229303509 2.09592442 2.0772429989 1.7782085764 1.673556424 0.9927684308 1.2570106182
ENSRNOG00000000112 2.2078928516 2.1826922975 2.4249220882 2.0250287945 2.1110313124 2.0635029423 1.8953026213
ENSRNOG00000000121 1.9202933002 2.5273206079 2.5741015081 2.2265085298 2.582556003 2.5753123307 2.1984941536
ENSRNOG00000000122 4.1255684518 4.4299506574 4.5071603491 4.2637856139 4.34269696 3.5849625007 3.9040023163
ENSRNOG00000000123 1.7070829918 1.9616233283 2.1127001327 1.4222330007 1.9221978484 1.9708536543 1.5801454844
ENSRNOG00000000127 2.3881895372 3.0347439493 2.9981955032 3.2295879227 4.0435194937 3.7729413378 3.2957230245
ENSRNOG00000000129 2.3074285252 2.979110755 3.1992797213 2.2203299549 3.6322682155 3.8982083525 3.5801454844
ENSRNOG00000000130 4.1622906135 4.7150696794 4.8733210629 3.9772799235 4.5849625007 4.9236246114 4.7739963251
ENSRNOG00000000133 3.2000648615 3.1168637577 3.1787146412 2.9579145986 2.7928553524 2.6780719051 2.2078928516
ENSRNOG00000000138 0.516015147 0.5993177937 1.0356239097 1.5849625007 2.2326607568 1.9745293125 2.0285691522
ENSRNOG00000000142 2.9278964537 2.3291235963 0.9671686075 1.4168397419 0.7048719645 1.9927684308 1.7224660245
ENSRNOG00000000145 3.2164548651 3.5490530293 3.4195388915 2.8797057663 2.3362833879 2.5849625007 2.6937657122
ENSRNOG00000000150 2.6380738372 2.9708536543 3.014355293 2.6870606883 2.6158870739 2.3161457423 2.4329594073
ENSRNOG00000000151 2.7125957804 3.5484366247 3.8354188405 4.5447326559 5.6959938131 5.3077927961 5.1941658685
ENSRNOG00000000155 3.0565835284 3.9354597478 3.6803243568 3.5134907456 3.8032270364 3.8865501473 3.2494453411
ENSRNOG00000000156 3.34269696 3.2772408983 1.7761039881 1.1505596766 0.5360529002 0.2750070475 0.3334237337
ENSRNOG00000000157 1.9164766444 2.1424134379 2.054848477 1.9145645235 2.2448870591 2.3305584 1.6599245584
ENSRNOG00000000161 1.7202784652 2.0772429989 1.9945797242 1.4541758932 1.7655347464 2.1602748314 1.8757800631
ENSRNOG00000000164 3.6616356023 4.2596491206 4.0635029423 3.2494453411 3.2418401836 3.1618876824 2.2295879227
ENSRNOG00000000165 1.3504972471 1.6158870739 0.9373443922 0.4541758932 0.7311832416 4.6392321632 4.5403993056
ENSRNOG00000000166 3.3441183345 3.3603642765 3.2494453411 1.9597701552 2.2357270598 3.1456774552 2.8698714062
This is what I am doing:
d=read.table("FPKM.1.SelectedSamples.txt", header=T, sep="\t", row.names=1)
dm=data.matrix(d)
log10.matrix <- log10(dm+1)
Z.log10.A.matrix <- t(scale(t(log10.matrix[idx,])))
tmp <- Z.log10.A.matrix[which(is.finite(Z.log10.A.matrix[,1])),]
length(which(!is.finite(tmp)))
fin.Z.log10.A.matrix <- tmp
set.seed(1)
km9.fin.Z.log.A.matrix <- kmeans(fin.Z.log10.A.matrix, 2, iter.max=40, nstart=10)
rowOrder <- names(sort(km9.fin.Z.log.A.matrix$cluster))
colorVector <- c("grey","purple")
clusterColors <- colorVector[ sort(km9.fin.Z.log.A.matrix$cluster)]
heatmap.2(fin.Z.log10.A.matrix[rowOrder,],trace="none",labRow=F,labCol=colnames(fin.Z.log10.A.matrix),col=hmcol,RowSideColors=clusterColors,Rowv=F,Colv=F,dendrogram="column",na.rm=T,main="Gene Expression")
These commands give me a nice heatmap with two clusters.
Now, how can I extract the members of these clusters?
Thanks in advance.
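Answer 0 (score: 1)
After running the k-means algorithm with
km9.fin.Z.log.A.matrix <- kmeans(fin.Z.log10.A.matrix, 2, iter.max=40, nstart=10)
you can use km9.fin.Z.log.A.matrix$cluster
to get the cluster assignments. It is a named integer vector with one entry per row of the input matrix (here, one per gene), and each value is the number of the cluster that contains that row.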
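To turn that vector into explicit membership lists, something like the following may help. It is only a sketch on made-up toy data: the matrix toy stands in for fin.Z.log10.A.matrix, and the names members, cluster1_genes, cluster1_data and assignment are illustrative, not from the question. It assumes the input matrix has gene IDs as row names, which kmeans() carries over as the names of the cluster vector.

```r
# Toy stand-in for fin.Z.log10.A.matrix: 6 genes x 3 samples (made-up values).
set.seed(1)
toy <- rbind(
  matrix(rnorm(9, mean = -2, sd = 0.1), nrow = 3),
  matrix(rnorm(9, mean =  2, sd = 0.1), nrow = 3)
)
rownames(toy) <- paste0("gene", 1:6)
colnames(toy) <- c("S1", "S2", "S3")

km <- kmeans(toy, centers = 2, iter.max = 40, nstart = 10)

# Named integer vector: names are the row names, values are cluster numbers.
clusters <- km$cluster

# List with one character vector of gene IDs per cluster.
members <- split(names(clusters), clusters)

# Gene IDs in cluster 1 only, and the matching rows of the matrix.
cluster1_genes <- names(clusters)[clusters == 1]
cluster1_data  <- toy[cluster1_genes, , drop = FALSE]

# Two-column table (gene ID, cluster number), convenient for write.table().
assignment <- data.frame(geneid = names(clusters),
                         cluster = unname(clusters))
```

With the objects from the question, the same calls would be applied to km9.fin.Z.log.A.matrix$cluster and fin.Z.log10.A.matrix. Note that the cluster numbers themselves are arbitrary labels and can change between runs unless the seed is fixed.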