How do I work with data after it has been clustered?

Time: 2019-04-13 03:58:06

Tags: pandas scikit-learn scipy hierarchical-clustering dendrogram

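I have clustered some stocks using Ward's method, and now I would like to group them by cluster and pair up the stocks within each cluster.

I have found a way to build a dictionary that contains every stock together with the dendrogram color it is associated with, but I have not found a way to get all the stocks that cluster together within a certain distance.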

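from collections import defaultdict

# den: assumed to be the dict returned by scipy.cluster.hierarchy.dendrogram()
clusterdict = defaultdict(list)
for ind, clust in zip(den['ivl'], den['leaves']):
    clusterdict[clust].append(ind)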

This is the dictionary it returns:

 defaultdict(<type 'list'>, {0: [u'GOLD'], 1: [u'AEM'], 2: [u'CDE'], 3: 
[u'CLF'], 4: [u'FOE'], 5: [u'HL'], 6: [u'LPX'], 7: [u'MAS'], 8: [u'NEM'], 
9: [u'NUE'], 10: [u'OLN'], 11: [u'PPG'], 12: [u'MUX'], 13: [u'WY'], 14: 
[u'X'], 15: [u'KGC'], 16: [u'AKS'], 17: [u'ALB'], 18: [u'PAAS'], 19: 
[u'FCX'], 20: [u'CCJ'], 21: [u'CENX'], 22: [u'SSRM'], 23: [u'STLD'], 24: 
[u'TREX'], 25: [u'IAG'], 26: [u'EGO'], 27: [u'TRQ'], 28: [u'AUY'], 29: 
[u'NG'], 30: [u'SA'], 31: [u'HUN'], 32: [u'NGD'], 33: [u'WPM'], 34: 
[u'CF'], 35: [u'TECK'], 36: [u'LYB'], 37: [u'TROX'], 38: [u'AG'], 39: 
[u'MOS'], 40: [u'FSM'], 41: [u'PVG'], 42: [u'SLCA'], 43: [u'SAND'], 44: 
[u'AGI'], 45: [u'CSTM'], 46: [u'BTG'], 47: [u'ESI'], 48: [u'AXTA'], 49: 
[u'SUM'], 50: [u'UNVR'], 51: [u'CC'], 52: [u'AA'], 53: [u'KL'], 54: 
[u'DWDP'], 55: [u'NTR']})

In case it helps, here is the linkage array behind the dendrogram:

[[   5.           27.            1.19107273    2.        ]
 [  12.           28.            1.86356669    2.        ]
 [  15.           29.            2.10022495    2.        ]
 [  32.           56.            2.85571413    3.        ]
 [  25.           40.            3.2928348     2.        ]
 [  43.           46.            3.62678069    2.        ]
 [  38.           44.            3.66910652    2.        ]
 [   2.           18.            3.99048391    2.        ]
 [  57.           58.            4.43112104    4.        ]
 [  59.           62.            4.54448187    5.        ]
 [  16.           26.            4.96083261    2.        ]
 [  41.           61.            6.63829892    3.        ]
 [  19.           35.            8.17068596    2.        ]
 [  60.           66.            8.21948828    4.        ]
 [  20.           67.            8.75546161    4.        ]
 [   4.            6.            9.37382844    2.        ]
 [  22.           30.           10.72164076    2.        ]
 [  50.           54.           11.62929046    2.        ]
 [  21.           37.           12.44096076    2.        ]
 [  13.           31.           12.76859026    2.        ]
 [   0.           70.           12.98710004    5.        ]
 [  47.           64.           13.63169703    5.        ]
 [   9.           71.           14.56215301    3.        ]
 [  45.           63.           15.24100602    3.        ]
 [  65.           69.           16.07353304    9.        ]
 [  48.           78.           17.15135825    4.        ]
 [  14.           42.           18.04499759    2.        ]
 [  10.           74.           18.29581966    3.        ]
 [   7.           81.           20.16860024    5.        ]
 [   1.           33.           20.58004761    2.        ]
 [  34.           55.           21.00435166    2.        ]
 [   3.           76.           22.71769878    6.        ]
 [  68.           79.           24.00631196    5.        ]
 [  77.           80.           25.43375614   14.        ]
 [  49.           82.           26.54461207    3.        ]
 [  23.           75.           29.11645193    3.        ]
 [  72.           87.           30.22339441    8.        ]
 [   8.           85.           30.49582653    3.        ]
 [  51.           90.           30.60054445    4.        ]
 [  73.           83.           33.15254175    5.        ]
 [  11.           86.           38.35470296    3.        ]
 [  39.           92.           42.06555848    9.        ]
 [  84.           88.           51.03959914   10.        ]
 [  36.           52.           51.76022424    2.        ]
 [  91.           95.           60.06346861    8.        ]
 [  89.           93.           67.07753611   17.        ]
 [  17.          100.           83.22804338    9.        ]
 [  96.           97.           83.37433519   12.        ]
 [ 101.          103.          104.12479363   29.        ]
 [  94.          102.          123.76112823   13.        ]
 [  24.           53.          130.46110771    2.        ]
 [ 104.          106.          149.73602378   31.        ]
 [  98.          107.          183.96335166   41.        ]
 [  99.          105.          195.52651673   15.        ]
 [ 108.          109.          520.2572738    56.        ]]
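To make "all the stocks that cluster together within a certain distance" concrete, here is a minimal, dependency-free sketch of cutting a linkage array at a distance threshold and then pairing the members of each resulting cluster. It assumes the SciPy linkage layout shown above (row i merges two cluster ids at a given distance and creates cluster id n + i) and that rows are sorted by distance, as they are for the Ward output above. The names `clusters_at_distance` and `pairs_within` and the five-leaf toy data are made up for illustration; they are not from the original post.

```python
from itertools import combinations


def clusters_at_distance(Z, labels, max_dist):
    """Group labels whose merges happen at or below max_dist.

    Z follows the SciPy linkage layout: each row is
    (id_a, id_b, merge_distance, size) and row i creates cluster id n + i.
    Assumes rows are sorted by merge distance (true for the array above).
    """
    n = len(labels)
    members = {i: [labels[i]] for i in range(n)}  # start with one leaf each
    for i, (a, b, dist, _size) in enumerate(Z):
        if dist > max_dist:
            break  # every later merge is at least this far apart
        members[n + i] = members.pop(int(a)) + members.pop(int(b))
    return list(members.values())


def pairs_within(clusters):
    """Return every unordered pair of labels inside each cluster."""
    return [list(combinations(c, 2)) for c in clusters]


# Hypothetical toy data: 5 leaves, 4 merges (not the real stock data).
labels = ["A", "B", "C", "D", "E"]
Z = [
    [0, 1, 1.0, 2],  # A + B      -> cluster 5
    [2, 3, 2.0, 2],  # C + D      -> cluster 6
    [4, 5, 3.0, 3],  # E + (A, B) -> cluster 7
    [6, 7, 9.0, 5],  # everything -> cluster 8
]

clusters = clusters_at_distance(Z, labels, max_dist=4.0)
pairs = pairs_within(clusters)
print(clusters)
print(pairs)
```

If SciPy is available, `scipy.cluster.hierarchy.fcluster(Z, t, criterion='distance')` performs the same kind of cut directly on the linkage array and returns one flat cluster id per leaf, which can then be grouped with a `defaultdict(list)` as in the snippet earlier in the question.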

0 answers:

No answers yet