SQL: group by if values are close

Date: 2013-12-05 14:49:51

Tags: sql sql-server group-by subquery gaps-and-islands

Class| Value
-------------
A    | 1
A    | 2
A    | 3
A    | 10
B    | 1

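I'm not sure whether this is practical to do in SQL. If the difference between values is less than 5 (or x), the rows should be grouped together (within the same class, of course).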

Expected result

Class| ValueMin | ValueMax
---------------------------
A    | 1        | 3
A    | 10       | 10
B    | 1        | 1

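For fixed intervals we can easily use GROUP BY. Here, though, the grouping depends on the values of neighboring rows: if values are consecutive or very close, they get "chained together".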
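To pin down the chaining rule before looking at SQL, here is a minimal sketch in Python. The function name `chain_groups` and the `max_gap` parameter are made up for illustration; the rule assumed is that a value joins the current group when it is less than `max_gap` above the previous value of the same class.

```python
from itertools import groupby

def chain_groups(rows, max_gap=5):
    """Return (class, min, max) per chain of nearby values (hypothetical helper)."""
    result = []
    # Sorting tuples orders by class first, then by value.
    for cls, items in groupby(sorted(rows), key=lambda r: r[0]):
        values = [v for _, v in items]
        lo = hi = values[0]
        for v in values[1:]:
            if v - hi < max_gap:
                hi = v                      # close enough: extend the current chain
            else:
                result.append((cls, lo, hi))
                lo = hi = v                 # gap too large: start a new chain
        result.append((cls, lo, hi))
    return result

rows = [("A", 1), ("A", 2), ("A", 3), ("A", 10), ("B", 1)]
print(chain_groups(rows))
```

Note that chaining is transitive: 1, 4, 8 would form a single group even though 8 - 1 is more than 5.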

Many thanks.

Assume MSSQL.

4 answers:

Answer 0: (score: 2)

You are trying to group rows by the gaps between values. The simplest way is to use the lag() function to find the gaps:

select class, min(value) as minvalue, max(value) as maxvalue
from (select class, value,
             sum(IsNewGroup) over (partition by class order by value) as GroupId
      from (select class, value,
                   (case when lag(value) over (partition by class order by value) > value - 5
                         then 0 else 1
                    end) as IsNewGroup
            from t
           ) t
     ) t
group by class, groupid;

Note that this assumes SQL Server 2012 or later, which supports lag() and cumulative sums.
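For readers who want to trace the two steps of the query by hand, the following Python sketch mirrors them: a per-row flag computed from the previous value (the lag() step), then a running sum of the flags that becomes the group id. It is only an illustration of the logic, not a replacement for the SQL; the function name and the `max_gap` parameter are assumptions.

```python
from itertools import accumulate, groupby

def lag_and_running_sum(rows, max_gap=5):
    """Return (class, group_id, min, max), mimicking the lag + cumulative-sum query."""
    out = []
    for cls, items in groupby(sorted(rows), key=lambda r: r[0]):
        values = [v for _, v in items]
        # lag(value): the previous value in the partition, None for the first row.
        prev = [None] + values[:-1]
        # IsNewGroup: 0 when the previous value is within max_gap, otherwise 1.
        is_new = [0 if p is not None and p > v - max_gap else 1
                  for p, v in zip(prev, values)]
        # The running sum of the flags plays the role of GroupId.
        group_ids = list(accumulate(is_new))
        for gid, members in groupby(zip(group_ids, values), key=lambda x: x[0]):
            vals = [v for _, v in members]
            out.append((cls, gid, min(vals), max(vals)))
    return out

rows = [("A", 1), ("A", 2), ("A", 3), ("A", 10), ("B", 1)]
print(lag_and_running_sum(rows))
```

The first row of each class has no previous value, so its flag is 1 and every class starts with group id 1.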

Answer 1: (score: 2)

Update: this answer is incorrect.

Assuming the table you provided is named sd_test, the following query gives the output you expect.

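In short, we need a way to find the value of the previous row. This is done with a join on the row ID. A grouping flag is then computed to check whether the difference is less than 5, and after that it is just a regular GROUP BY.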

The code would be more readable if your version of SQL Server supports window functions with partitioning.

SELECT 
A.CLASS
,MIN(A.VALUE) AS MIN_VALUE
,MAX(A.VALUE) AS MAX_VALUE
FROM
     (SELECT 
      ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
      ,CLASS
      ,VALUE
      FROM SD_TEST) AS A
LEFT JOIN 
     (SELECT 
       ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
      ,CLASS
      ,VALUE
     FROM SD_TEST) AS B
ON A.CLASS = B.CLASS AND A.ROW_ID=B.ROW_ID+1
GROUP BY A.CLASS,CASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END
ORDER BY A.CLASS,CASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END DESC

PS: I think the above is ANSI-compliant, so it should run in most SQL dialects. If not, someone please correct me.
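A rough Python sketch of why grouping on a 0/1 flag alone cannot work in general: every row with the same flag inside a class falls into the same bucket, so separate chains get merged. The data below is hypothetical (it is not from the question), and `group_by_flag_only` is a made-up name that imitates the GROUP BY expression of the query above.

```python
from itertools import groupby

def group_by_flag_only(rows, max_gap=5):
    """Imitate GROUP BY class, <0/1 flag>: return {(class, flag): (min, max)}."""
    buckets = {}
    for cls, items in groupby(sorted(rows), key=lambda r: r[0]):
        prev = 0  # COALESCE(B.VALUE, 0) for the first row of each class
        for _, v in items:
            flag = 1 if abs(prev - v) < max_gap else 0
            buckets.setdefault((cls, flag), []).append(v)
            prev = v
    return {key: (min(vals), max(vals)) for key, vals in buckets.items()}

# Hypothetical data with three chains: 1-2, 10-11 and 20.
rows = [("A", 1), ("A", 2), ("A", 10), ("A", 11), ("A", 20)]
print(group_by_flag_only(rows))
# {('A', 1): (1, 11), ('A', 0): (10, 20)}
```

Only two buckets come out, and their ranges overlap. The question's sample data happens to produce the expected rows because each class there contains at most one row of each flag's chain.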

Answer 2: (score: 1)

Here is one way to get the information you need:

SELECT Under5.Class,
  (
    SELECT MIN(m2.Value) 
    FROM MyTable AS m2 
    WHERE m2.Value < 5 
      AND m2.Class = Under5.Class
  ) AS ValueMin,
  (
    SELECT MAX(m3.Value) 
    FROM MyTable AS m3 
    WHERE m3.Value < 5 
      AND m3.Class = Under5.Class
  ) AS ValueMax
FROM 
(
  SELECT DISTINCT m1.Class
  FROM MyTable AS m1 
  WHERE m1.Value < 5
) AS Under5
UNION
SELECT Over4.Class,
  (
    SELECT MIN(m4.Value) 
    FROM MyTable AS m4 
    WHERE m4.Value >= 5 
      AND m4.Class = Over4.Class
  ) AS ValueMin,
  (
    SELECT Max(m5.Value) 
    FROM MyTable AS m5 
    WHERE m5.Value >= 5 
      AND m5.Class = Over4.Class
  ) AS ValueMax
FROM 
(
  SELECT DISTINCT m6.Class
  FROM MyTable AS m6 
  WHERE m6.Value >= 5
) AS Over4

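Answer 3: (score: 1)

These give the correct result. They rely on the fact that there must be as many group starts as group ends, and that both will be in ascending order.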

if object_id('tempdb..#temp') is not null drop table #temp

create table #temp (class char(1),Value int);

insert into #temp values ('A',1);
insert into #temp values ('A',2);
insert into #temp values ('A',3);
insert into #temp values ('A',10);
insert into #temp values ('A',13);
insert into #temp values ('A',14);
insert into #temp values ('b',7);
insert into #temp values ('b',8);
insert into #temp values ('b',9);
insert into #temp values ('b',12);
insert into #temp values ('b',22);
insert into #temp values ('b',26);
insert into #temp values ('b',67);

Method 1: CTE and row offsets

with cte as
(select  distinct class,value,ROW_NUMBER() over ( partition by class order by value ) as R from #temp),
cte2 as
(
    select 
        c1.class
        ,c1.value
        ,c2.R as PreviousRec
        ,c3.r as NextRec
    from 
        cte c1
        left join cte c2 on (c1.class = c2.class and c1.R= c2.R+1 and c1.Value < c2.value + 5)
        left join cte c3 on (c1.class = c3.class and c1.R= c3.R-1 and c1.Value > c3.value - 5)
)

select
    Starts.Class
    ,Starts.Value as StartValue
    ,Ends.Value as EndValue
from
    (
     select 
        class
        ,value
        ,row_number() over ( partition by class order by value ) as GroupNumber
    from cte2
        where PreviousRec is null) as Starts join
    (
     select 
        class
        ,value
        ,row_number() over ( partition by class order by value ) as GroupNumber
    from cte2
        where NextRec is null) as Ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber

Method 2: inline views with NOT EXISTS

select
        Starts.Class
        ,Starts.Value as StartValue
        ,Ends.Value as EndValue            
from
    (
        select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
        from
            (select distinct class,value from #temp) as T
        where not exists (select 1 from #temp where class=t.class and Value < t.Value and Value > t.Value -5 )
    ) Starts join
    (
        select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
        from
            (select distinct class,value from #temp) as T
        where not exists (select 1 from #temp where class=t.class and Value > t.Value and Value < t.Value +5 )
    ) ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber

In both methods I start with select distinct, because without it a duplicate entry at the start or end of a group would break the result.
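The start/end pairing idea can be sketched outside SQL as well. The Python below follows the NOT EXISTS variant: a value is a start if no other value lies less than 5 below it, an end if none lies less than 5 above it, and the n-th start is paired with the n-th end. The function name and `max_gap` are illustrative assumptions; the data is the #temp sample above.

```python
def starts_and_ends(rows, max_gap=5):
    """Pair the n-th group start with the n-th group end, per class."""
    out = []
    for cls in sorted({c for c, _ in rows}):
        # Distinct, ascending values for this class (the SELECT DISTINCT step).
        values = sorted({v for c, v in rows if c == cls})
        # Start: no other value in the open interval (v - max_gap, v).
        starts = [v for v in values
                  if not any(v - max_gap < o < v for o in values)]
        # End: no other value in the open interval (v, v + max_gap).
        ends = [v for v in values
                if not any(v < o < v + max_gap for o in values)]
        # Both lists are ascending, so position n in each is the same group.
        out.extend((cls, s, e) for s, e in zip(starts, ends))
    return out

rows = [("A", 1), ("A", 2), ("A", 3), ("A", 10), ("A", 13), ("A", 14),
        ("b", 7), ("b", 8), ("b", 9), ("b", 12), ("b", 22), ("b", 26), ("b", 67)]
print(starts_and_ends(rows))
```

With this data the result is (A, 1, 3), (A, 10, 14), (b, 7, 12), (b, 22, 26) and (b, 67, 67).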