Trying to compute a slope using SQL syntax

Time: 2016-05-10 15:21:17

Tags: netezza

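I am a relatively new SQL programmer, and I am trying to get the code below to run in SQL. The code computes the slope of a given data set, following exactly the same logic as the Excel SLOPE function. The problem is that the query is rejected because the aggregates are nested (sum and count inside another sum). But if I create subqueries for the count and the sum, I have to group by x and y; otherwise x and y are not available in my outer query for the calculation.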

CREATE TABLE TEST (X FLOAT, Y FLOAT);

INSERT INTO TEST (X, Y) VALUES (1,4.10242258729964);
INSERT INTO TEST (X, Y) VALUES (2,4.57708865242591);
INSERT INTO TEST (X, Y) VALUES (3,5.16785670619896);
INSERT INTO TEST (X, Y) VALUES (4,6.88149559336059);

select sum((x-sum(x)/count(x))^2)/sum(((x-sum(x)/count(x))*(y-sum(y)/count(y))))
from TEST

3 Answers:

Answer 0: (score: 1)

You can compute the slope from sum(x * x), sum(x * y), avg(x), avg(y), and the row count n:

SELECT avg(x) AS mx,sum(x*x) AS sx2,sum(x*y) as sxy,avg(y) as my, count(x) AS n
FROM test

Then you can use:

SELECT (sxy-n*mx*my)/(sx2 - n* mx*mx)
FROM
(    SELECT avg(x) AS mx,sum(x*x) AS sx2,sum(x*y) as sxy,avg(y) as my, count(x) AS n
     FROM test
) stats;
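For reference, the expression in the outer query follows from the usual least-squares slope by expanding the centered sums. A short derivation, writing \bar{x} and \bar{y} for mx and my, and n for the row count (the notation is introduced here for illustration and is not part of the original answer):

```latex
\begin{aligned}
\sum_i (x_i-\bar{x})(y_i-\bar{y})
  &= \sum_i x_i y_i - \bar{x}\sum_i y_i - \bar{y}\sum_i x_i + n\,\bar{x}\,\bar{y}
   = \sum_i x_i y_i - n\,\bar{x}\,\bar{y}, \\
\sum_i (x_i-\bar{x})^2
  &= \sum_i x_i^2 - 2\bar{x}\sum_i x_i + n\,\bar{x}^2
   = \sum_i x_i^2 - n\,\bar{x}^2, \\
\text{slope}
  &= \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}
   = \frac{\mathrm{sxy} - n\cdot\mathrm{mx}\cdot\mathrm{my}}{\mathrm{sx2} - n\cdot\mathrm{mx}^2}.
\end{aligned}
```

Both simplifications use \sum_i x_i = n\bar{x} and \sum_i y_i = n\bar{y}, which is why a single pass of plain aggregates is enough and no nested aggregate is needed.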

Answer 1: (score: 0)

I would usually do something along these lines (I kept the syntax as simple as possible to avoid anything that might cause problems, such as CTEs):

CREATE TABLE #test (x FLOAT, y FLOAT);
INSERT INTO #test SELECT 1, 4.10242258729964;
INSERT INTO #test SELECT 2, 4.57708865242591;
INSERT INTO #test SELECT 3, 5.16785670619896;
INSERT INTO #test SELECT 4, 6.88149559336059;
SELECT 
    (N * SUM_XY - SUM_X * SUM_Y) / (N * SUM_X2 - SUM_X * SUM_X) AS slope
FROM 
    (
    SELECT
        COUNT(*) AS N,
        SUM(x) AS SUM_X,
        SUM(x * x) AS SUM_X2,
        SUM(y) AS SUM_Y,
        SUM(y * y) AS SUM_Y2,
        SUM(x * y) AS SUM_XY
    FROM
        #test) a;

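I ran this, then noticed that you have another answer, taken from "SQL Hacks". I ran both versions and they give exactly the same result, but the other version is shorter :D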
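That the two versions agree is expected. Assuming the shorter version computes (Σxy − n·x̄·ȳ) / (Σx² − n·x̄²), multiplying its numerator and denominator by n gives the expression used in the query above (the symbols below are illustrative notation for N, SUM_X, SUM_Y, SUM_X2 and SUM_XY):

```latex
\frac{\sum_i x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_i x_i^2 - n\,\bar{x}^2}
= \frac{n\sum_i x_i y_i - (n\bar{x})(n\bar{y})}{n\sum_i x_i^2 - (n\bar{x})^2}
= \frac{N\cdot\mathrm{SUM\_XY} - \mathrm{SUM\_X}\cdot\mathrm{SUM\_Y}}
       {N\cdot\mathrm{SUM\_X2} - \mathrm{SUM\_X}^2},
\qquad\text{since } n\bar{x} = \sum_i x_i,\; n\bar{y} = \sum_i y_i .
```

Note that SUM_Y2 is selected in the subquery but does not appear in the slope formula.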

Answer 2: (score: 0)

Here is a working version of the SQL you wrote:

SELECT sum((x-avgx)*(x-avgx)) / sum((x-avgx)*(y-avgy))
FROM TEST, (SELECT sum(X)/count(X) as avgx, sum(Y)/count(Y) as avgy FROM TEST) average;

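I looked up the Excel SLOPE function, and its definition is slightly different (the numerator and denominator are swapped):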

SELECT sum((x-avgx)*(y-avgy)) / sum((x-avgx)*(x-avgx))
FROM TEST, 
    (
        SELECT 
            sum(X)/count(X) as avgx, 
            sum(Y)/count(Y) as avgy 
        FROM TEST
    ) average;
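As a sanity check, the second query can be worked by hand on the four sample rows from the question. The figures below are rounded and are my own arithmetic, not output from a database:

```latex
\begin{aligned}
\bar{x} &= \tfrac{1+2+3+4}{4} = 2.5, \\
\sum_i (x_i-\bar{x})^2 &= 2.25 + 0.25 + 0.25 + 2.25 = 5, \\
\sum_i (x_i-\bar{x})(y_i-\bar{y}) &= \sum_i (x_i-\bar{x})\,y_i \\
  &\approx -1.5(4.102423) - 0.5(4.577089) + 0.5(5.167857) + 1.5(6.881496)
   \approx 4.463994, \\
\text{slope} &\approx \frac{4.463994}{5} \approx 0.892799 .
\end{aligned}
```

The first query divides the same two sums the other way round, so on this data it returns the reciprocal, roughly 1.12007, rather than the value SLOPE would give.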

Hope this is what you need :)