Grouping similar data together in Python

Time: 2018-12-22 01:35:52

Tags: python dataset

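I would like to know which grouping algorithm is best suited to splitting this data into separate classes/labels, or even just telling me the X-axis range in which each group occurs.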

The figure below shows the dataset after preprocessing, with the output values scaled.

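I then replaced every value below 0 with 0, which splits the data into 5 groups. That result is correct, but in some datasets a single group can itself be broken up by a run of at most 10 zeros (negative values in the original dataset).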
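The clipping step described above can be sketched roughly as follows. This is only a minimal illustration, assuming the scaled values are available as a 1-D sequence; the function name `nonzero_runs` and the toy `sample` list are made up for this example and are not part of the original data.

```python
import numpy as np

def nonzero_runs(values):
    """Return (start, stop) index pairs of contiguous runs where the
    clipped values are > 0. `stop` is exclusive."""
    # Replace every negative value with 0, as described in the question.
    clipped = np.clip(np.asarray(values, dtype=float), 0.0, None)
    active = clipped > 0
    # Pad with False on both ends so every run has a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate: run start, run stop, run start, run stop, ...
    return [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])]

# Hypothetical toy input, not taken from the real dataset.
sample = [-0.5, 0.0, 1.2, 2.3, -0.1, 0.0, 0.4, 0.6, 0.0]
runs = nonzero_runs(sample)
print(runs)
```

Each pair gives the index range of one candidate group; mapping the indices back to X values depends on how the X column is stored.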

This image shows a dataset of 5 characters that is segmented perfectly.

This image shows a dataset of 5 characters that is segmented only semi-perfectly.


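What would be an adequate algorithm for grouping these data samples together so that the result comes close to a 5-class separation?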
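One possible approach, offered only as a sketch and not as a confirmed solution, is to treat each contiguous non-zero stretch as a run and then merge runs that are separated by a short gap of zeros. The helper name `merge_close_runs`, the `max_gap` threshold, and the example `runs` list are all assumptions made for illustration.

```python
def merge_close_runs(runs, max_gap):
    """Merge (start, stop) runs whose separating gap is at most
    `max_gap` samples. `stop` is exclusive."""
    merged = []
    for start, stop in sorted(runs):
        if merged and start - merged[-1][1] <= max_gap:
            # Gap is small enough: extend the previous group.
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            # Gap is too large: start a new group.
            merged.append((start, stop))
    return merged

# Hypothetical runs, not taken from the real dataset.
runs = [(5, 20), (24, 40), (60, 75), (77, 90)]
groups = merge_close_runs(runs, max_gap=10)
print(groups)
```

The right `max_gap` would have to be tuned against the real data, for example by increasing it until the expected 5 groups come out. If the gaps inside a group can be as long as the gaps between groups, a fixed threshold will not be enough.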

A dataset in which the separation occurs twice:

83,0.0
84,0.0
85,0.0
86,0.0
87,0.0
88,1.0213809414657748
89,2.292905194654561
90,3.1046416220956177
91,2.9477843257742085
92,1.683122525517483
93,0.39591262837524127
94,0.13415708475282187
95,0.03219986635744753
96,0.1184713588458412
97,0.27042683602003686
98,0.33316973964795954
99,0.43120553401690914
100,0.486105567241021
101,0.48512520192151426
102,0.5263002473279794
103,0.5557109759532477
104,0.5968860139093924
105,0.6115913819471868
106,0.4027751260092248
107,0.0
108,0.0
109,0.0
110,0.0
111,0.0
112,0.0
113,0.0
114,0.0
115,0.0
116,0.0
117,0.0
118,0.0
119,0.0
120,0.0
121,1.6850832114545737
122,2.6497554962899734
123,2.7526931172564573
124,2.3095713007816894
125,1.2429418571534792
126,0.6459038924680661
127,0.46845913304598824
128,0.3782662090808494
129,0.3968930011450685
130,0.3557179631889238
131,0.26552503922378495
132,0.25376074479354943
133,0.1282749375377041
134,0.30768044249943644
135,0.7233521864847445
136,0.6547271281913837
137,0.711587892054509
138,0.7194307401073584
139,0.5576717065922613
140,0.2821911229999519
141,0.13807850877924654
142,0.022396287665584616
143,0.0
144,0.0
145,0.0
146,0.0
147,0.0

0 answers:

No answers