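PostgreSQL corr aggregate function returns NULL

Time: 2017-01-23 13:28:40

Tags: sql postgresql aggregate-functions correlation

I have two perfectly (or perfectly un-?) correlated series of numbers, and I want to find the correlation between them. The original scenario is different and more complex, but the problem lies somewhere in the correlation method that Postgres uses. Consider the following query: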

    WITH all_series AS (
      select t as id, 'One' as name, 1 as num from generate_series(1, 10) t
      UNION
      select t as id, 'Two' as name, 2 as num from generate_series(1, 10) t
      ORDER BY name, id
    )

    SELECT (t1.name || '|' || t2.name) as names, corr(t2.num, t1.num) c
    FROM all_series t1
    INNER JOIN all_series t2 ON t1.id = t2.id
    WHERE t1.name > t2.name
    GROUP BY (t1.name || '|' || t2.name)
    ORDER BY (t1.name || '|' || t2.name)

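If you remove the GROUP BY and expand the select list, the numbers line up perfectly, which should give SOMETHING as a correlation... but it gives NULL (not even zero).

Regards,

1 answer:

Answer 0: (score: 1)

I guess you want the correlation of the generated series, not of the constant 1: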

    WITH all_series AS (
      select t as id, 'One' as name, 1 as num, t.val from generate_series(1, 10) t(val)
      UNION ALL
      select t as id, 'Two' as name, 2 as num, t.val from generate_series(1, 10) t(val)
      ORDER BY name, id
    )
    SELECT (t1.name || '|' || t2.name) as names, corr(t2.val, t1.val) c
    FROM all_series t1
    INNER JOIN all_series t2 ON t1.id = t2.id
    WHERE t1.name > t2.name
    GROUP BY (t1.name || '|' || t2.name);

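Your version computes the correlation on num, which is constant ("1" for one series, "2" for the other). I think the NULL comes from a division by zero in the calculation: the correlation coefficient divides the covariance by the product of the two standard deviations, and both are zero for constant columns. You could argue that the correlation of two constant columns should be 1, but it is a degenerate case (0/0), so strictly speaking the value is undefined.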
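To see the degenerate case in isolation, here is a minimal sketch that strips the question down to two calls to corr. This is my own reduction, not something from the question: the aliases g, val, constant_inputs and varying_inputs are made up for illustration, and the explicit float8 casts are only there to match corr's double precision arguments.

```sql
-- Hypothetical minimal reproduction; all aliases are made up for illustration.
SELECT
  -- Both inputs are constants, so both variances are zero (the 0/0 case).
  corr(1::float8, 2::float8) AS constant_inputs,
  -- A series correlated with itself: the variance is non-zero.
  corr(g.val::float8, g.val::float8) AS varying_inputs
FROM generate_series(1, 10) AS g(val);
```

I would expect constant_inputs to come back NULL and varying_inputs to come back as 1, which matches what the question observed: the join is fine, it is the constant num column that makes the result undefined.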