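I am using the CUES algorithm to perform document clustering. Based on the cluster distance, two clusters have to be merged. When two clusters Ci and Cj are merged, Ci is updated with Ci ∪ Cj, Cj is removed, and the cluster indices are updated.
I initialize the clusters in Java using the Set class. However, I cannot update the cluster at a specific index, because a Set has no notion of position. If there is another class I could use, please help me. Please find the algorithm below. Thanks.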
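One possible approach, shown here only as a minimal sketch: keep each cluster as a `Set<Integer>` of document ids and hold the clusters in a `List`, which does support access and removal by index. The class name `ClusterMerge` and the method `merge` are illustrative names, not part of CUES, and the sketch uses 0-based indices where the algorithm uses 1-based ones.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ClusterMerge {
    // Merges cluster y into cluster x, then removes cluster y.
    // List.remove(int) shifts every later cluster down by one index,
    // which is the "update the cluster indices" step.
    static void merge(List<Set<Integer>> clusters, int x, int y) {
        if (x == y) {
            throw new IllegalArgumentException("x and y must differ");
        }
        clusters.get(x).addAll(clusters.get(y)); // Cx <- Cx U Cy
        clusters.remove(y);                      // drop Cy by index
    }

    public static void main(String[] args) {
        // Start with one singleton cluster per document: {0}, {1}, {2}, {3}.
        List<Set<Integer>> clusters = new ArrayList<>();
        for (int d = 0; d < 4; d++) {
            Set<Integer> c = new HashSet<>();
            c.add(d);
            clusters.add(c);
        }
        merge(clusters, 0, 2);
        System.out.println(clusters);
    }
}
```

Because `y` is a primitive `int`, `clusters.remove(y)` resolves to the index-based overload rather than `remove(Object)`. If x is greater than y, the merged cluster itself moves down one position after the removal, so it is simplest to always call the method with x < y, as the algorithm's loops already guarantee.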
Input:
a) A set of N clusters, C = {C1, C2, ..., CN}, and noc = |C|, the number of clusters.
b) Ci = {di} ∀i ∈ [1, N], where di is the ith document of the data set.
c) A similarity matrix Sim[i][j] = cluster_dis(Ci, Cj), ∀i, j ∈ [1, N].

Steps of the Algorithm:
X ← 0, Y ← 0
while noc > 1 and X ≥ 0 and Y ≥ 0 do
    min_dist ← N
    X ← −1, Y ← −1
    for i = 1 to noc − 1 do
        for j = i + 1 to noc do
            if min_dist ≥ cluster_dis(Ci, Cj) and cluster_dis(Ci, Cj) ≥ 0 then
                min_dist ← cluster_dis(Ci, Cj)
                X ← i, Y ← j
            end if
        end for
    end for
    if X ≥ 0 and Y ≥ 0 then
        CX ← CX ∪ CY
        Sim ← merge(Sim, X, Y)
        noc ← noc − 1
    end if
end while
return C
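The loop above could be sketched in Java roughly as follows. This is a sketch under stated assumptions, not the reference CUES implementation: the post does not define `merge(Sim, X, Y)`, so the hypothetical `mergeSim` below assumes a single-link update (the smaller of the two distances, ignoring negative entries) and then drops row and column Y. It also assumes a negative entry in `sim` means "these two clusters must not be merged", and it uses 0-based indices. The names `CuesSketch`, `combine`, `mergeSim` and `cluster` are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CuesSketch {
    // Assumed rule: a negative value means "not mergeable"; otherwise take the minimum.
    static double combine(double a, double b) {
        if (a < 0) return b;
        if (b < 0) return a;
        return Math.min(a, b);
    }

    // Hypothetical merge(Sim, X, Y): folds row/column y into x, then removes y.
    // Returns a new (n-1) x (n-1) matrix and leaves the input untouched.
    static double[][] mergeSim(double[][] sim, int x, int y) {
        int n = sim.length;
        double[][] out = new double[n - 1][n - 1];
        for (int i = 0, r = 0; i < n; i++) {
            if (i == y) continue;
            for (int j = 0, c = 0; j < n; j++) {
                if (j == y) continue;
                double v = sim[i][j];
                if (i == x && j != x) v = combine(sim[x][j], sim[y][j]);
                else if (j == x && i != x) v = combine(sim[i][x], sim[i][y]);
                out[r][c++] = v;
            }
            r++;
        }
        return out;
    }

    static List<Set<Integer>> cluster(double[][] sim) {
        int n = sim.length;
        List<Set<Integer>> clusters = new ArrayList<>();
        for (int d = 0; d < n; d++) {          // Ci = {di}
            Set<Integer> c = new TreeSet<>();
            c.add(d);
            clusters.add(c);
        }
        int x = 0, y = 0;
        // clusters.size() plays the role of noc.
        while (clusters.size() > 1 && x >= 0 && y >= 0) {
            double minDist = n;                // min_dist <- N, as in the pseudocode
            x = -1;
            y = -1;
            for (int i = 0; i < clusters.size() - 1; i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    if (sim[i][j] >= 0 && minDist >= sim[i][j]) {
                        minDist = sim[i][j];
                        x = i;
                        y = j;
                    }
                }
            }
            if (x >= 0 && y >= 0) {
                clusters.get(x).addAll(clusters.get(y)); // CX <- CX U CY
                clusters.remove(y);                      // removing CY decrements noc
                sim = mergeSim(sim, x, y);
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        double[][] sim = {
            { 0.0,  0.1, -1.0, -1.0},
            { 0.1,  0.0, -1.0, -1.0},
            {-1.0, -1.0,  0.0,  0.2},
            {-1.0, -1.0,  0.2,  0.0}
        };
        System.out.println(cluster(sim));
    }
}
```

Since the inner loop only visits pairs with i < j, Y is always larger than X, so removing cluster Y never changes the position of the merged cluster X. Keeping the cluster list and the matrix the same size after every merge is what lets the same index address both.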