Select all rows from a table where a column's value differs from the previous value in that column

Time: 2014-07-07 17:53:00

Tags: sql sql-server

First, let me describe the table I have.

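One column is a company ID (an integer), and another is a date in yyyymmdd format (also stored as an integer). Together, these two columns uniquely identify a row in my table. The table (at least the way I think of it) is ordered by Company_id, Date.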

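The table has several other columns. I'll call the one I'm interested in mycolumn (an integer). Also, sorry for the formatting below, but I don't know how to create a proper table here.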

Company_id   Date       mycolumn 
1            20121015   1 
1            20121113   1 
1            20130108   2 
1            20130207   2 
1            20130409   2 
1            20130815   1 
2            20050611   7 
2            20080719   7 
4            20091114   3 
4            20091215   3 
4            20100304   5 
4            20110215   5 

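What I'm interested in is the changes in mycolumn for each company ID, and the dates on which they happen. For example, the company with ID 1 has 2 changes (from 1 to 2, then from 2 back to 1), the company with ID 2 has no changes, and the company with ID 4 has one change, from 3 to 5. The output table should be: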

Company_id   Date       mycolumn 
1            20121113   1 
1            20130108   2 
1            20130409   2 
1            20130815   1 
4            20091215   3 
4            20100304   5 

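I know I could do an intermediate step, such as selecting the companies that have more than one distinct mycolumn value and then using a join to exclude the companies with no change from my table. But I don't know what to do next...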

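Well, I did come up with something, but it is both messy and doesn't work properly. What I did initially was add 2 columns showing the first and last date for each Company_id - mycolumn combination. Then I used a few more steps to get where I wanted. This works fine for companies like the last one, which goes from value 3 to value 5, but it fails for companies like the first one, which goes from 1 to 2 and then back to 1...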
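To make the failure mode concrete, here is a minimal sketch of what a first/last-date summary per Company_id - mycolumn combination looks like for company 1. This is only a guess at the approach described above: the table variable `@Sample` and the aliases `first_date` and `last_date` are illustrative names, not taken from the original attempt.

```sql
-- Hypothetical reconstruction; @Sample, first_date and last_date are
-- illustrative names, not from the question.
DECLARE @Sample TABLE (Company_id INT, [Date] INT, mycolumn INT);
INSERT INTO @Sample (Company_id, [Date], mycolumn)
VALUES (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2),
       (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1);

-- One row per (Company_id, mycolumn) combination with its first and last date.
SELECT Company_id, mycolumn,
       MIN([Date]) AS first_date,
       MAX([Date]) AS last_date
FROM @Sample
GROUP BY Company_id, mycolumn
ORDER BY Company_id, mycolumn;
-- The mycolumn = 1 group spans 20121015 to 20130815, which covers the whole
-- period in which mycolumn was 2, so the two separate runs of 1 are merged.
```

Grouping by value loses the order in which the values occurred, which is why a company that returns to an earlier value cannot be handled this way.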

Thanks for your help!

6 Answers:

Answer 0 (score: 0)

Try this:

select company_id, mycolumn, max(date) from 
tableName
group by company_id, mycolumn

Cheers!!

Answer 1 (score: 0)

This isn't the prettiest thing in the world, and I'm sure it can be optimized, but it should give you the results:

;With Cte As
(
    Select  *, Row_Number() Over (Partition By Company_Id Order By Date) RN
    From    [Table]  -- placeholder name; TABLE is a reserved word, so it must be bracketed
)
Select  *
From
(
    Select  C.company_id, C.Date, C.mycolumn
    From    Cte C
    Cross Apply
    (
        Select  *
        From    Cte X
        Where   X.RN = C.RN + 1
        And     X.company_id = C.company_id 
    ) X
    Where   X.mycolumn <> C.mycolumn
    Union All
    Select  X.company_id, X.Date, X.mycolumn
    From    Cte C
    Cross Apply
    (
        Select  *
        From    Cte X
        Where   X.RN = C.RN + 1
        And     X.company_id = C.company_id 
    ) X
    Where   X.mycolumn <> C.mycolumn
) R
Order By Company_Id, Date

Answer 2 (score: 0)

DECLARE @Tbl TABLE (
    Ident INT IDENTITY(1,1),
    [ROW] INT,
    Company_id INT,
    [Date] INT,
    mycolumn INT
)

INSERT INTO @Tbl
          SELECT NULL,1,20121015,1 
    UNION SELECT NULL,1,20121113,1 
    UNION SELECT NULL,1,20130108,2 
    UNION SELECT NULL,1,20130207,2 
    UNION SELECT NULL,1,20130409,2 
    UNION SELECT NULL,1,20130815,1 
    UNION SELECT NULL,2,20050611,7 
    UNION SELECT NULL,2,20080719,7 
    UNION SELECT NULL,4,20091114,3 
    UNION SELECT NULL,4,20091215,3 
    UNION SELECT NULL,4,20100304,5 
    UNION SELECT NULL,4,20110215,5 

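-- Re-insert every row with a per-company sequence number in [ROW]
INSERT INTO @Tbl
    SELECT
        ROW_NUMBER() OVER(PARTITION BY Company_id ORDER BY Company_id ASC,[Date] ASC),Company_id,[Date],mycolumn
    FROM @Tbl

-- Remove the original, unnumbered copies
DELETE @Tbl WHERE [ROW] IS NULL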


SELECT
    t.Company_id,t.[Date],t.mycolumn
FROM @Tbl t
INNER JOIN (
    select
        t1.Ident [Ident1],t2.Ident [Ident2]
    from @Tbl t1 
    INNER JOIN @Tbl t2 ON t1.Company_id=t2.Company_id
        AND t1.[ROW]=(t2.[ROW]-1)
        AND t1.mycolumn<>t2.mycolumn
) delta on t.Ident IN (delta.[Ident1],delta.Ident2)
ORDER BY t.Company_id ASC,t.[Date] ASC

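Answer 3 (score: 0)

While I was editing this, 2 other answers came in, but I think this one is different enough to be worth including - otherwise I wasted all that typing :(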

DECLARE @T TABLE(CompanyID INT, DateInt INT, MyCol INT);
INSERT INTO @T(CompanyID , DateInt , MyCol) 
    VALUES (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2), (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1 )
          , (2, 20050611, 7), (2, 20080719, 7), (4, 20091114, 3), (4, 20091215, 3), (4, 20100304, 5), (4, 20110215, 5);
with cteRanked as (
    SELECT CompanyID , DateInt , MyCol, ROW_NUMBER() OVER (PARTITION BY CompanyID ORDER BY DateInt) as RowNum
    FROM @T
), cteRuns as (
    SELECT T1.CompanyID , T1.DateInt as D1, T2.DateInt as D2
        , T1.MyCol as C1, T2.MyCol as C2
    FROM cteRanked as T1 
        INNER JOIN cteRanked as T2 ON T1.CompanyID = T2.CompanyID and T1.RowNum + 1 = T2.RowNum 
    WHERE T1.MyCol != T2.MyCol
), ctePaired as (
    SELECT CompanyID, D1 as DateInt, C1 as MyCol FROM cteRuns 
    UNION --or UNION ALL to get repeated rows when a run is 1 long
    SELECT CompanyID, D2 as DateInt, C2 as MyCol FROM cteRuns 
)SELECT * FROM ctePaired
ORDER BY CompanyID, DateInt
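The comment about UNION versus UNION ALL is easier to see with a run that is only one row long. The sketch below reuses the same CTE shape on made-up data (company 9 and its dates are illustrative, not from the question): the middle row is both the end of one change and the start of the next, so it appears in both halves of the union.

```sql
-- Illustrative data only: company 9 has the values 1, 2, 1, so the middle
-- row is a run of length 1.
DECLARE @Demo TABLE (CompanyID INT, DateInt INT, MyCol INT);
INSERT INTO @Demo (CompanyID, DateInt, MyCol)
VALUES (9, 20140101, 1), (9, 20140201, 2), (9, 20140301, 1);

WITH cteRanked AS (
    SELECT CompanyID, DateInt, MyCol,
           ROW_NUMBER() OVER (PARTITION BY CompanyID ORDER BY DateInt) AS RowNum
    FROM @Demo
), cteRuns AS (
    -- Each row here is a pair of adjacent rows whose values differ.
    SELECT T1.CompanyID, T1.DateInt AS D1, T2.DateInt AS D2,
           T1.MyCol AS C1, T2.MyCol AS C2
    FROM cteRanked AS T1
        INNER JOIN cteRanked AS T2
            ON T1.CompanyID = T2.CompanyID AND T1.RowNum + 1 = T2.RowNum
    WHERE T1.MyCol <> T2.MyCol
)
SELECT CompanyID, D1 AS DateInt, C1 AS MyCol FROM cteRuns
UNION ALL  -- swap for UNION to collapse the duplicated middle row
SELECT CompanyID, D2 AS DateInt, C2 AS MyCol FROM cteRuns
ORDER BY CompanyID, DateInt;
```

With UNION ALL this returns four rows, with 20140201 listed twice; with UNION it returns three, because the duplicate is removed.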

Answer 4 (score: 0)

You can use lead and lag:

with C as
(
  select *,
         lag(mycolumn) over(partition by company_id order by Date) as lagmycolumn,
         lead(mycolumn) over(partition by company_id order by Date) as leadmycolumn
  from YourTable
)
select company_id, Date, mycolumn
from C
where mycolumn <> lagmycolumn or
      mycolumn <> leadmycolumn
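For anyone who wants to try the LAG/LEAD idea directly, here is a self-contained sketch that runs it against the sample data from the question. It assumes SQL Server 2012 or later, where LAG and LEAD are available; the table variable `@YourTable` is just a stand-in for the real table.

```sql
-- @YourTable is a stand-in for the real table, loaded with the question's data.
DECLARE @YourTable TABLE (company_id INT, [Date] INT, mycolumn INT);
INSERT INTO @YourTable (company_id, [Date], mycolumn)
VALUES (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2),
       (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1),
       (2, 20050611, 7), (2, 20080719, 7),
       (4, 20091114, 3), (4, 20091215, 3), (4, 20100304, 5), (4, 20110215, 5);

WITH C AS (
    SELECT company_id, [Date], mycolumn,
           LAG(mycolumn)  OVER (PARTITION BY company_id ORDER BY [Date]) AS lagmycolumn,
           LEAD(mycolumn) OVER (PARTITION BY company_id ORDER BY [Date]) AS leadmycolumn
    FROM @YourTable
)
SELECT company_id, [Date], mycolumn
FROM C
WHERE mycolumn <> lagmycolumn    -- lag is NULL on a company's first row, so this test is not true there
   OR mycolumn <> leadmycolumn   -- lead is NULL on a company's last row
ORDER BY company_id, [Date];
```

On this data the query returns the six rows listed in the question's expected output. The first and last rows of a company are kept only when they differ from their one existing neighbour, because a comparison with NULL never evaluates to true. One caveat under the same reasoning: if mycolumn itself could be NULL, a change to or from NULL would not be detected by these comparisons.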

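Answer 5 (score: 0)

Since you are using SQL Server 2014, this is the most efficient way I found to get the query results. I suspect there is a better way to "ignore" company IDs that have no changes.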

DECLARE @T TABLE(CompanyID INT, DateInt INT, MyCol INT);
INSERT INTO @T(CompanyID , DateInt , MyCol) 
VALUES  (1, 20121015, 1), (1, 20121113, 1), (1, 20130108, 2), (1, 20130207, 2), (1, 20130409, 2), (1, 20130815, 1 ),
        (2, 20050611, 7), (2, 20080719, 7), (4, 20091114, 3), (4, 20091215, 3), (4, 20100304, 5), (4, 20110215, 5)

;WITH Stage1 AS
(
    SELECT   *
            ,UseThis    = IIF(LEAD(MyCol, 1, 0) OVER (PARTITION BY CompanyID ORDER BY DateInt) != MyCol, 1, 0)
            ,Change     = IIF(LAG(MyCol, 1, 0) OVER (PARTITION BY CompanyID ORDER BY DateInt) = 0 OR LAG(MyCol, 1, 0) OVER (PARTITION BY CompanyID ORDER BY DateInt) = MyCol, 0, 1)
    FROM @T
)
--  Find companies that have not changed
,Stage2 AS
(
    SELECT *
            ,Inert  = SUM(Change) OVER (PARTITION BY CompanyID)
    FROM Stage1
)
SELECT   CompanyID
        ,DateInt
        ,MyCol
FROM Stage2
WHERE UseThis = 1 
AND Inert != 0