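I'm a relatively new SQL programmer, and I'm trying to get the code below to run in SQL. It computes the slope of a given data set, following exactly the same logic as the Excel SLOPE function. The problem is that the aggregates are nested, so the count is not allowed. But if I create subqueries for the counts and sums, I would have to group by x and y, otherwise x and y would not be available in my outer query for the calculation.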
CREATE TABLE TEST (X FLOAT, Y FLOAT);
INSERT INTO TEST (X, Y) VALUES (1,4.10242258729964);
INSERT INTO TEST (X, Y) VALUES (2,4.57708865242591);
INSERT INTO TEST (X, Y) VALUES (3,5.16785670619896);
INSERT INTO TEST (X, Y) VALUES (4,6.88149559336059);
select sum((x-sum(x)/count(x))^2)/sum(((x-sum(x)/count(x))*(y-sum(y)/count(y))))
from TEST
Answer 0 (score: 1)
You can compute the slope from sum(x * x), sum(x * y), avg(x), avg(y), and n:
SELECT avg(x) AS mx,sum(x*x) AS sx2,sum(x*y) as sxy,avg(y) as my, count(x) AS n
FROM test
Then you can use:
SELECT (sxy-n*mx*my)/(sx2 - n* mx*mx)
FROM
( SELECT avg(x) AS mx,sum(x*x) AS sx2,sum(x*y) as sxy,avg(y) as my, count(x) AS n
FROM test
) stats;
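As a sketch of why these five aggregates are enough, the deviation form of the slope can be expanded as follows. Here $\bar{x}$, $\bar{y}$, and $n$ correspond to mx, my, and n in the query, and the two sums on the last line correspond to sxy and sx2; the symbol $b$ for the slope is just notation chosen for this illustration.

```latex
\begin{aligned}
b &= \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \\[4pt]
\sum_i (x_i - \bar{x})(y_i - \bar{y})
  &= \sum_i x_i y_i - \bar{x}\sum_i y_i - \bar{y}\sum_i x_i + n\bar{x}\bar{y}
   = \sum_i x_i y_i - n\bar{x}\bar{y} \\[4pt]
\sum_i (x_i - \bar{x})^2
  &= \sum_i x_i^2 - 2\bar{x}\sum_i x_i + n\bar{x}^2
   = \sum_i x_i^2 - n\bar{x}^2 \\[4pt]
b &= \frac{\sum_i x_i y_i - n\bar{x}\bar{y}}{\sum_i x_i^2 - n\bar{x}^2}
\end{aligned}
```

Because every term on the last line is a plain aggregate over the table, no aggregate has to be nested inside another.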
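Answer 1 (score: 0)
I would usually do something along these lines (I've kept the syntax as simple as possible to avoid anything that might cause trouble, such as CTEs):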
CREATE TABLE #test (x FLOAT, y FLOAT);
INSERT INTO #test SELECT 1, 4.10242258729964;
INSERT INTO #test SELECT 2, 4.57708865242591;
INSERT INTO #test SELECT 3, 5.16785670619896;
INSERT INTO #test SELECT 4, 6.88149559336059;
SELECT
(N * SUM_XY - SUM_X * SUM_Y) / (N * SUM_X2 - SUM_X * SUM_X) AS slope
FROM
(
SELECT
COUNT(*) AS N,
SUM(x) AS SUM_X,
SUM(x * x) AS SUM_X2,
SUM(y) AS SUM_Y,
SUM(y * y) AS SUM_Y2,
SUM(x * y) AS SUM_XY
FROM
#test) a;
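I ran this and then noticed you already have another answer, taken from "SQL Hacks". I ran both versions and they give exactly the same result, but the other one is shorter :D
Answer 2 (score: 0)
Here is a working version of the SQL you wrote: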
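The agreement between the two versions is expected. As a short sketch, writing $\bar{x} = \sum_i x_i / n$ and $\bar{y} = \sum_i y_i / n$ and multiplying the numerator and denominator of the mean-based formula by $n$ gives the sum-based formula used here (N, SUM_X, SUM_Y, SUM_X2, and SUM_XY in the query):

```latex
\begin{aligned}
\frac{\sum_i x_i y_i - n\bar{x}\bar{y}}{\sum_i x_i^2 - n\bar{x}^2}
  &= \frac{n\sum_i x_i y_i - n^2\bar{x}\bar{y}}{n\sum_i x_i^2 - n^2\bar{x}^2} \\[4pt]
  &= \frac{n\sum_i x_i y_i - \left(\sum_i x_i\right)\left(\sum_i y_i\right)}
          {n\sum_i x_i^2 - \left(\sum_i x_i\right)^2}
\end{aligned}
```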
SELECT sum((x-avgx)*(x-avgx)) / sum((x-avgx)*(y-avgy))
FROM TEST, (SELECT sum(X)/count(X) as avgx, sum(Y)/count(Y) as avgy FROM TEST) average;
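I looked up the Excel SLOPE function, and its definition is slightly different: the x-y product sum goes in the numerator and the sum of squared x deviations in the denominator: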
SELECT sum((x-avgx)*(y-avgy)) / sum((x-avgx)*(x-avgx))
FROM TEST,
(
SELECT
sum(X)/count(X) as avgx,
sum(Y)/count(Y) as avgy
FROM TEST
) average;
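Another way to get the averages next to every row without nesting aggregates is a window function. This is only a sketch, assuming the database supports AVG(...) OVER () (recent versions of SQL Server, PostgreSQL, Oracle, MySQL, and SQLite do); the alias t and the column name slope are made up for this illustration. Squares are written as a multiplication because ^ is not an exponent operator in every dialect (in SQL Server it is bitwise XOR).

```sql
-- Sketch: attach the overall averages to each row with window functions,
-- then aggregate the deviations in the outer query.
SELECT SUM((x - avgx) * (y - avgy)) / SUM((x - avgx) * (x - avgx)) AS slope
FROM (
    SELECT x,
           y,
           AVG(x) OVER () AS avgx,  -- empty OVER (): average over all rows
           AVG(y) OVER () AS avgy
    FROM TEST
) t;
```

For the four sample rows in the question this should come out at roughly 0.8928.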
Hope this is what you need :)