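I have a survey application, and I need to cluster the responses to detect signs of coherence or decoherence.
I am using AI4R, and my code looks like this (the sample code comes from AI4R):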
require 'ai4r'
include Ai4r::Data
include Ai4r::Clusterers

# 5 Questions on a post training survey
questions = [ "The material covered was appropriate for someone with my level of knowledge of the subject.",
"The material was presented in a clear and logical fashion",
"There was sufficient time in the session to cover the material that was presented",
"The instructor was respectful of students",
"The instructor provided good examples"]
# Answers to each question go from 1 (bad) to 5 (excellent)
# The answers array has one element per completed survey.
# Each completed survey is in turn an array with the answer to each question.
answers = [
[ 1, 2, 3, 2, 2], # Answers of person 1
[ 5, 5, 3, 2, 2], # Answers of person 2
]
data_set = DataSet.new(:data_items => answers, :data_labels => questions)
# Let's group answers in 4 groups
clusterer = Diana.new.build(data_set, 4)
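This in turn lets me create charts like this one (the survey has questions tied to themes/axes).
The problem now is that you have to choose the number of clusters to pass to AI4R. How can I detect the number of clusters using Ruby? (This question may really boil down to statistics as a discipline...)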
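I saw on Wikipedia that there is a technique called the elbow method (illustration from Wikipedia), which compares the number of clusters against the variance they explain: you pick the point after which adding another cluster no longer explains much more variance. This technique fits my needs well, but I don't know how to implement it in Ruby. (I did ANOVA back in the day as an undergraduate, so I get what they mean, but that is where it stops. I may also need to cross-post on a statistics forum.)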
Is there a Ruby library I haven't discovered yet that could help with this, or how could I approach it with the Ruby ecosystem?
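
To make the elbow idea concrete, here is a minimal pure-Ruby sketch of what I have in mind. It is not AI4R code: `SAMPLE_ANSWERS` is made-up data, `simple_kmeans` is a naive stand-in clusterer with a deterministic start, and `elbow` uses one common heuristic (the largest drop in marginal gain of explained variance), which is only one of several ways to locate the elbow. All of these names are hypothetical.

```ruby
# Made-up survey answers (1 to 5) forming three rough groups of respondents.
SAMPLE_ANSWERS = [
  [5, 5, 5, 5, 5], [4, 5, 5, 5, 5], [5, 5, 4, 5, 5],
  [1, 1, 1, 5, 5], [2, 1, 1, 5, 5], [1, 1, 1, 4, 5],
  [5, 5, 1, 1, 1], [5, 4, 1, 1, 1], [5, 5, 1, 2, 1]
].freeze

# Squared Euclidean distance between two equal-length numeric arrays.
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Component-wise mean of a non-empty list of points.
def centroid(points)
  points.transpose.map { |dim| dim.sum.to_f / dim.size }
end

# Within-cluster sum of squares: squared distance of every point to the
# centroid of its own cluster, summed over all clusters.
def wcss(clusters)
  clusters.reject(&:empty?).sum do |points|
    center = centroid(points)
    points.sum { |point| squared_distance(point, center) }
  end
end

# Fraction of the total variance explained by a given clustering (0.0 to 1.0).
def explained_variance(clusters)
  total = wcss([clusters.flatten(1)])
  return 1.0 if total.zero?

  1.0 - wcss(clusters) / total
end

# Naive k-means stand-in (an assumption, not AI4R): starts from the first
# point, then repeatedly adds the point farthest from the chosen centers.
def simple_kmeans(points, k, iterations: 25)
  centers = [points.first]
  while centers.size < k
    centers << points.max_by { |p| centers.map { |c| squared_distance(p, c) }.min }
  end
  clusters = [points]
  iterations.times do
    clusters = points.group_by { |p| centers.min_by { |c| squared_distance(p, c) } }.values
    centers = clusters.map { |members| centroid(members) }
  end
  clusters
end

# Elbow heuristic: the k where the gain in explained variance drops the most,
# i.e. the largest (gain up to k) minus (gain from k to k + 1).
def elbow(curve)
  curve.keys.sort.each_cons(3).max_by do |prev, k, nxt|
    (curve[k] - curve[prev]) - (curve[nxt] - curve[k])
  end[1]
end

curve = (1..5).map { |k| [k, explained_variance(simple_kmeans(SAMPLE_ANSWERS, k))] }.to_h
curve.each { |k, ev| puts format('k=%d explained variance=%.3f', k, ev) }
puts "Suggested number of clusters: #{elbow(curve)}"
```

With AI4R, I assume the clusters for each candidate k could be taken from the clusterer instead of `simple_kmeans`, for example `Diana.new.build(data_set, k).clusters.map(&:data_items)`, and passed to `explained_variance` in the same way. I have not verified that the heuristic behaves well on real survey data, where the elbow is often much less sharp than in this made-up example.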