Invalid use of group function when trying to compute the Pearson correlation

Time: 2013-12-22 03:56:42

Tags: mysql sql pearson

I am trying to work out how to compute the Pearson correlation coefficient in SQL. This is the formula I am using: [image: formula] This is the table I am using: [image: table]

This is the query I have so far, but it gives me this message: Invalid use of group function

select first_id, second_id, movie_id, first_score, second_score,  count(*) as n, 
sum((first_score-avg(first_score))*(second_score-avg(second_score)))/
(
sqrt(sum(first_score-avg(first_score)))*
sqrt(sum(second_score-avg(second_score))))
as pearson
from connections
group by second_id

Thanks for your help.

2 Answers:

Answer 0: (score: 2)

Here is a query that performs the calculation in the formula:

select sum((first_score - avg_first_score)*(second_score - avg_second_score)) /
       (sqrt(sum(pow((first_score - avg_first_score), 2)))*
        sqrt(sum(pow((second_score - avg_second_score), 2)))
       ) as r      
from connections c cross join
     (select avg(first_score) as avg_first_score, avg(second_score) as avg_second_score
      from connections
     ) const;

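Your attempt has a number of problems. In particular, it nests avg() inside sum(), which MySQL rejects as an invalid use of a group function, and the sums in the denominator are missing the squares. The query above precomputes the averages of the two scores in a subquery, then applies the formula almost exactly as written.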
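As a sanity check, the query can be run against a few made-up rows and compared with a direct calculation. The sketch below uses Python's sqlite3 module rather than MySQL purely so it is self-contained; the sample scores, the reduced two-column table, and the helper name pearson are assumptions for illustration, not part of the original post.

```python
import math
import sqlite3

# Hypothetical sample rows (first_score, second_score); not from the original post.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0), (4.0, 4.0), (5.0, 5.0)]

conn = sqlite3.connect(":memory:")
# Some SQLite builds lack sqrt/pow, so register them explicitly.
conn.create_function("sqrt", 1, math.sqrt)
conn.create_function("pow", 2, math.pow)
conn.execute("create table connections (first_score real, second_score real)")
conn.executemany("insert into connections values (?, ?)", rows)

# Same query as in the answer: averages come from a one-row subquery,
# so no aggregate is nested inside another aggregate.
query = """
select sum((first_score - avg_first_score)*(second_score - avg_second_score)) /
       (sqrt(sum(pow((first_score - avg_first_score), 2)))*
        sqrt(sum(pow((second_score - avg_second_score), 2)))
       ) as r
from connections c cross join
     (select avg(first_score) as avg_first_score, avg(second_score) as avg_second_score
      from connections
     ) const
"""
r_sql = conn.execute(query).fetchone()[0]


def pearson(pairs):
    """Reference Pearson coefficient computed directly in Python."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))


r_ref = pearson(rows)
print(r_sql, r_ref)  # the two values should agree up to rounding
```

Note that this computes a single coefficient over the whole table; it does not group by second_id.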

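Answer 1: (score: 0)

From a purely syntactic standpoint, your group by clause has a problem. It should list every non-aggregated column for the query to work. It should be: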

group by first_id, second_id, movie_id, first_score, second_score
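Changing the group by clause alone would not remove the nested aggregates (sum over avg) that trigger the "Invalid use of group function" message, and grouping by every selected column would typically leave one row per group. If the goal is one coefficient per second_id (an assumption based on the question's group by second_id), one possible sketch joins each row to its group's precomputed averages. It is shown through Python's sqlite3 so it can be run as-is; the sample data and the aliases a, avg_first and avg_second are hypothetical.

```python
import math
import sqlite3

# Hypothetical sample data (second_id, first_score, second_score).
rows = [
    (1, 1.0, 2.0), (1, 2.0, 4.0), (1, 3.0, 6.0),  # scores rise together
    (2, 1.0, 3.0), (2, 2.0, 2.0), (2, 3.0, 1.0),  # scores move in opposite directions
]

conn = sqlite3.connect(":memory:")
# Some SQLite builds lack sqrt/pow, so register them explicitly.
conn.create_function("sqrt", 1, math.sqrt)
conn.create_function("pow", 2, math.pow)
conn.execute(
    "create table connections (second_id integer, first_score real, second_score real)"
)
conn.executemany("insert into connections values (?, ?, ?)", rows)

# Per-group averages are computed in a derived table and joined back,
# so the outer sum() never wraps another aggregate.
query = """
select c.second_id,
       count(*) as n,
       sum((c.first_score - a.avg_first)*(c.second_score - a.avg_second)) /
       (sqrt(sum(pow(c.first_score - a.avg_first, 2)))*
        sqrt(sum(pow(c.second_score - a.avg_second, 2)))) as pearson
from connections c
join (select second_id,
             avg(first_score) as avg_first,
             avg(second_score) as avg_second
      from connections
      group by second_id) a
  on a.second_id = c.second_id
group by c.second_id
order by c.second_id
"""
result = conn.execute(query).fetchall()
for second_id, n, r in result:
    print(second_id, n, r)
```

With this sample data the first group should come out close to 1 and the second close to -1. A group whose scores are all identical has zero variance, so its denominator is zero and the coefficient is undefined for that group.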