如何使用TSQL识别记录中的记录模式序列?

时间:2015-07-29 14:43:12

标签: sql-server sql-server-2008 tsql common-table-expression

这对我来说是一项相当新的练习,但我需要找到一种方法来识别表格中的模式序列。 例如,假设我有一个类似于以下内容的简单表:

enter image description here

现在我想做的是识别并分组所有具有值为5,9和6的序列模式的记录,并在查询中显示它们。你会如何使用T-SQL完成这项任务?

结果应如下所示:

enter image description here

我已经找到了一些可能实现这一目标的潜在例子,但却无法找到真正有用的东西。

2 个答案:

答案 0 :(得分:8)

您可以使用包含在CTE中的以下查询,以便将序列号分配给序列中包含的值:

;WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER(ORDER BY k) AS rn
    FROM (VALUES(1, 5), (2, 9), (3, 6)) x(k,v)
)

<强>输出:

v   rn
-------
5   1
9   2
6   3

使用上面的CTE,您可以识别孤岛,即包含整个序列的连续行的片段:

;WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER(ORDER BY k) AS rn
    FROM (VALUES(1, 5), (2, 9), (3, 6)) x(k,v)
), Grp AS (
SELECT [Key], [Value], 
       ROW_NUMBER() OVER (ORDER BY [Key]) - rn AS grp            
FROM mytable AS m
LEFT JOIN Seq AS s ON m.Value = s.v
)
SELECT *
FROM Grp

<强>输出:

    Key Value   grp
   -----------------
    1   5       0
    2   9       0
    3   6       0
    6   5       3
    7   9       3
    8   6       3

grp字段可帮助您准确识别这些岛屿。

现在你需要做的就是过滤掉部分群体:

;WITH Seq AS (
    SELECT v, ROW_NUMBER() OVER(ORDER BY k) AS rn
    FROM (VALUES(1, 5), (2, 9), (3, 6)) x(k,v)
), Grp AS (
SELECT [Key], [Value], 
       ROW_NUMBER() OVER (ORDER BY [Key]) - rn AS grp            
FROM mytable AS m
LEFT JOIN Seq AS s ON m.Value = s.v
)
SELECT g1.[Key], g1.[Value]
FROM Grp AS g1
INNER JOIN (
   SELECT grp
   FROM Grp
   GROUP BY grp
   HAVING COUNT(*) = 3 ) AS g2
ON g1.grp = g2.grp

Demo here

注意:此答案的初始版本使用INNER JOINSeq。如果表格包含5, 42, 9, 6之类的值,则此功能无效,因为42会将INNER JOIN过滤掉,并且此序列被错误地识别为有效序列。感谢@HABO进行此编辑。

答案 1 :(得分:1)

不是很优化,但我认为是对方回答:

CREATE TABLE pattern (
    rowID INT IDENTITY(1,1) PRIMARY KEY, 
    rowValue INT NOT NULL
);

INSERT INTO pattern (rowValue) VALUES (5);
INSERT INTO pattern (rowValue) VALUES (9);
INSERT INTO pattern (rowValue) VALUES (6);

SELECT * FROM pattern;

SELECT Trg.* FROM Keys Trg 
INNER JOIN pattern Pt ON (Trg.fValue = Pt.rowValue) 
INNER JOIN (
    SELECT K.fKey - P.rowID AS X, COUNT(*) AS Xc FROM Keys K 
        LEFT JOIN pattern P ON (K.fValue = P.rowValue) 
    WHERE 
        (P.rowID IS NOT NULL) 
    GROUP BY K.fKey - P.rowID 
    HAVING COUNT(*) = (SELECT COUNT(*) FROM pattern)
) Z ON (Trg.fKey - Pt.rowID = Z.X);

我使用表格将其连接到主表格。我计算Key和模式Key之间的差异,我只显示差异匹配的行(以及模式表中差异匹配行的行数)。