Detecting the number of clusters in a data set with Ruby

Asked: 2013-10-01 13:45:58

Tags: ruby statistics

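I have a survey application, and I need to cluster the responses to detect signs of coherence or decoherence.

I'm using AI4R, and my code looks like this (the sample code comes from AI4R):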

require 'ai4r'
include Ai4r::Data
include Ai4r::Clusterers

# 5 questions on a post-training survey
questions = [   "The material covered was appropriate for someone with my level of knowledge of the subject.", 
                "The material was presented in a clear and logical fashion", 
                "There was sufficient time in the session to cover the material that was presented", 
                "The instructor was respectful of students", 
                "The instructor provided good examples"]

# Answers to each question go from 1 (bad) to 5 (excellent).
# The answers array has one element per completed survey.
# Each completed survey is in turn an array with the answer to each question.
answers = [ 
            [ 1, 2, 3, 2, 2],   # Answers of person 1
            [ 5, 5, 3, 2, 2],   # Answers of person 2
          ]

data_set = DataSet.new(:data_items => answers, :data_labels => questions)

# Group the answers into 4 clusters (this needs at least 4 surveys in `answers`)
clusterer = Diana.new.build(data_set, 4)

This in turn lets me create charts of the clustered responses (the survey's questions are tied to topics, which serve as the axes).

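The problem is that you have to choose the number of clusters to pass to AI4R. How can I use Ruby to detect the number of clusters? (This question may well boil down to statistics rather than Ruby...)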


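Enter the elbow method...

I saw on Wikipedia that there is a technique called the elbow method (Wikipedia illustrates it with a plot), which compares the number of clusters against the variance they explain. This technique fits my needs well, but I don't know how to implement it in Ruby. (I did ANOVA back in the day as an undergraduate, so I get what they mean, but that's where it stops. I may also need to cross-post on a statistics forum.)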
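To make "variance explained" concrete, here is a minimal plain-Ruby sketch that computes it for one partition as 1 minus the ratio of within-cluster to total sum of squares. The helper names (`squared_distance`, `centroid`, `sum_of_squares`, `explained_variance`) and the sample rows are invented for illustration and are not part of AI4R. It assumes each clustering is available as an array of clusters, each an array of numeric rows; with AI4R that might be something like `clusterer.clusters.map(&:data_items)`, but that is an assumption to check against the AI4R documentation.

```ruby
# Squared Euclidean distance between two equal-length numeric vectors.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Component-wise mean of a list of vectors.
def centroid(points)
  points.transpose.map { |column| column.sum.to_f / column.size }
end

# Sum of squared distances from each point to the centroid of its group.
def sum_of_squares(points)
  center = centroid(points)
  points.sum { |point| squared_distance(point, center) }
end

# Fraction of the total variance explained by a partition:
# 1 - (within-cluster sum of squares / total sum of squares).
def explained_variance(clusters)
  total = sum_of_squares(clusters.flatten(1))
  return 1.0 if total.zero?
  within = clusters.sum { |cluster| sum_of_squares(cluster) }
  1.0 - within / total
end

# Hypothetical partitions of four made-up rows (not real survey data).
one_cluster  = [[[1, 2], [1, 3], [5, 5], [5, 4]]]
two_clusters = [[[1, 2], [1, 3]], [[5, 5], [5, 4]]]

puts explained_variance(one_cluster)   # 0.0: a single cluster explains nothing
puts explained_variance(two_clusters)  # about 0.952
```

Computing this value for each candidate number of clusters gives the curve that the elbow method inspects.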

Is there a Ruby library I haven't discovered yet that can help with this, or how else could I use the Ruby ecosystem to solve this problem?
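For what it's worth, locating the elbow does not need a dedicated library once there is one score per candidate cluster count. The sketch below uses one common heuristic, not the only possible definition of the elbow: pick the point farthest from the straight line joining the first and last points of the curve. The method name `elbow_k` and the score values are made up for illustration.

```ruby
# Returns the k whose (k, score) point lies farthest from the straight line
# joining the first and last points of the curve.
def elbow_k(ks, scores)
  raise ArgumentError, "need at least three points" if ks.size < 3

  x1 = ks.first.to_f
  y1 = scores.first.to_f
  dx = ks.last - x1
  dy = scores.last - y1

  best = ks.zip(scores).max_by do |k, score|
    # Perpendicular distance to the line, without the constant denominator
    # (it is the same for every point, so it does not affect the maximum).
    (dy * (k - x1) - dx * (score - y1)).abs
  end
  best.first
end

# Made-up explained-variance values for k = 1..6, for illustration only.
ks     = [1, 2, 3, 4, 5, 6]
scores = [0.0, 0.55, 0.85, 0.90, 0.93, 0.95]

puts elbow_k(ks, scores)  # prints 3
```

This heuristic only looks at the shape of the curve, so it will always return some k even when the data has no clear cluster structure; plotting the curve is still a useful sanity check.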

0 answers:

No answers yet