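I have a problem that looked easy to solve, but it is turning out to be tricky.
Simplified: I need a way to identify unique sets of rows within groups defined by another column. In the basic example, the source table contains only three columns: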
routeID nodeID nodeName
1 1 a
1 2 b
2 1 a
2 2 b
3 1 a
3 2 b
4 1 a
4 2 c
5 1 a
5 2 c
6 1 a
6 2 b
6 3 d
7 1 a
7 2 b
7 3 d
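So the routeID column refers to the set of nodes that defines a route.
What I need to do is group the routes somehow, so that each unique sequence of nodes is represented by only one routeID.
In my real case I tried using window functions to add columns that help identify the node sequences, but I still don't know how to get those unique sequences and group the routes.
As the final result I want to get the unique routes, e.g. routes 1, 2 and 3 collapsed into a single route.
Do you have any idea how to help me?
Edit:
Another table that I want to join with the one from the example might look like this: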
journeyID nodeID nodeName routeID
1 1 a 1
1 2 b 1
2 1 a 1
2 2 b 1
3 1 a 4
3 2 c 4
...........................
...........................
Answer 0 (score: 0)
You can try this idea:
DECLARE @DataSource TABLE
(
[routeID] TINYINT
,[nodeID] TINYINT
,[nodeName] CHAR(1)
);
INSERT INTO @DataSource ([routeID], [nodeID], [nodeName])
VALUES ('1', '1', 'a')
,('1', '2', 'b')
,('2', '1', 'a')
,('2', '2', 'b')
,('3', '1', 'a')
,('3', '2', 'b')
,('4', '1', 'a')
,('4', '2', 'c')
,('5', '1', 'a')
,('5', '2', 'c')
,('6', '1', 'a')
,('6', '2', 'b')
,('6', '3', 'd')
,('7', '1', 'a')
,('7', '2', 'b')
,('7', '3', 'd');
SELECT DS.[routeID]
,nodes.[value]
,ROW_NUMBER() OVER (PARTITION BY nodes.[value] ORDER BY [routeID]) AS [rowID]
FROM
(
-- getting unique route ids
SELECT DISTINCT [routeID]
FROM @DataSource DS
) DS ([routeID])
CROSS APPLY
(
-- for each route id creating CSV list with its node names (ordered by node id)
SELECT STUFF
(
(
SELECT ',' + [nodeName]
FROM @DataSource DSI
WHERE DSI.[routeID] = DS.[routeID]
ORDER BY [nodeID]
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
)
) nodes ([value]);
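The code returns one row per route, with the comma-separated node list in [value] and a [rowID] that restarts at 1 for each distinct list.
So you only need to filter by rowID = 1. Of course, you can change the code to fit your business criteria (for example, to show the last route ID with a given node list instead of the first).
Also, the result of the ROW_NUMBER function cannot be referenced directly in the WHERE clause, so you need to wrap the query before filtering: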
WITH DataSource AS
(
SELECT DS.[routeID]
,nodes.[value]
,ROW_NUMBER() OVER (PARTITION BY nodes.[value] ORDER BY [routeID]) AS [rowID]
FROM
(
-- getting unique route ids
SELECT DISTINCT [routeID]
FROM @DataSource DS
) DS ([routeID])
CROSS APPLY
(
-- for each route id creating CSV list with its node names (ordered by node id)
SELECT STUFF
(
(
SELECT ',' + [nodeName]
FROM @DataSource DSI
WHERE DSI.[routeID] = DS.[routeID]
ORDER BY [nodeID]
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
,1
,1
,''
)
) nodes ([value])
)
SELECT DS2.*
FROM DataSource DS1
INNER JOIN @DataSource DS2
ON DS1.[routeID] = DS2.[routeID]
WHERE DS1.[rowID] = 1;
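As a side note, the same "one list per route, then keep one route per list" idea can be written more compactly if STRING_AGG is available. The following is only a minimal sketch, assuming SQL Server 2017 or later; the CTE name RouteLists and the column name nodeList are made up for illustration, and the sample rows are a subset of the question's data.

```sql
-- Sketch only: assumes SQL Server 2017+, where STRING_AGG exists.
DECLARE @DataSource TABLE
(
    [routeID] TINYINT
   ,[nodeID] TINYINT
   ,[nodeName] CHAR(1)
);

-- a subset of the sample data from the question
INSERT INTO @DataSource ([routeID], [nodeID], [nodeName])
VALUES (1, 1, 'a'), (1, 2, 'b')
      ,(2, 1, 'a'), (2, 2, 'b')
      ,(4, 1, 'a'), (4, 2, 'c')
      ,(6, 1, 'a'), (6, 2, 'b'), (6, 3, 'd');

WITH RouteLists AS  -- hypothetical name
(
    -- one row per route with its node names concatenated in nodeID order
    SELECT [routeID]
          ,STRING_AGG([nodeName], ',') WITHIN GROUP (ORDER BY [nodeID]) AS [nodeList]
    FROM @DataSource
    GROUP BY [routeID]
)
-- keep the smallest routeID for every distinct node list
SELECT MIN([routeID]) AS [routeID]
      ,[nodeList]
FROM RouteLists
GROUP BY [nodeList];
```

For these sample rows this should leave routes 1, 4 and 6, one per distinct node list. On older versions the FOR XML PATH approach above remains the option to use.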
Answer 1 (score: 0)
OK, let's use some recursion to build the complete node list for each routeID.
First, let's populate the source table and the journey table:
-- your source
declare @r as table (routeID int, nodeID int, nodeName char(1))
-- your other table
declare @j as table (journeyID int, nodeID int, nodeName char(1), routeID int)
-- temp results table
declare @routes as table (routeID int primary key, nodeNames varchar(1000))
;with
s as (
select *
from (
values
(1, 1, 'a'),
(1, 2, 'b'),
(2, 1, 'a'),
(2, 2, 'b'),
(3, 1, 'a'),
(3, 2, 'b'),
(4, 1, 'a'),
(4, 2, 'c'),
(5, 1, 'a'),
(5, 2, 'c'),
(6, 1, 'a'),
(6, 2, 'b'),
(6, 3, 'd'),
(7, 1, 'a'),
(7, 2, 'b'),
(7, 3, 'd')
) s (routeID, nodeID, nodeName)
)
insert into @r
select *
from s
;with
s as (
select *
from (
values
(1, 1, 'a', 1),
(1, 2, 'b', 1),
(2, 1, 'a', 1),
(2, 2, 'b', 1),
(3, 1, 'a', 4),
(3, 2, 'c', 4)
) s (journeyID, nodeID, nodeName, routeID)
)
insert into @j
select *
from s
Now let's extract the routes:
;with
d as (
select *, row_number() over (partition by r.routeID order by r.nodeID desc) n2
from @r r
),
r as (
select d.*, cast(nodeName as varchar(1000)) Names, cast(0 as bigint) i2
from d
where nodeId=1
union all
select d.*, cast(r.names + ',' + d.nodeName as varchar(1000)), r.n2
from d
join r on r.routeID = d.routeID and r.nodeId=d.nodeId-1
)
insert into @routes
select routeID, Names
from r
where n2=1
The @routes table will look like this:
routeID nodeNames
1 'a,b'
2 'a,b'
3 'a,b'
4 'a,c'
5 'a,c'
6 'a,b,d'
7 'a,b,d'
Now for the final output:
-- the unique routes
select MIN(r.routeID) routeID, nodeNames
from @routes r
group by nodeNames
-- the unique journeys
select MIN(journeyID) journeyID, r.nodeNames
from @j j
inner join @routes r on j.routeID = r.routeID
group by nodeNames
Output:
routeID nodeNames
1 'a,b'
4 'a,c'
6 'a,b,d'
and
journeyID nodeNames
1 'a,b'
3 'a,c'
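If you would rather keep every routeID and just tag it with the route it collapses into (which may be handy when joining the journey table), a windowed MIN is one possible way. This is a rough sketch, not part of the solution above: it re-declares @routes with the values listed earlier so the batch runs on its own, and the column name canonicalRouteID is made up for illustration.

```sql
-- Sketch only: @routes is re-declared with the values shown earlier
-- so that this batch is self-contained.
declare @routes as table (routeID int primary key, nodeNames varchar(1000))

insert into @routes (routeID, nodeNames)
values (1, 'a,b'), (2, 'a,b'), (3, 'a,b'),
       (4, 'a,c'), (5, 'a,c'),
       (6, 'a,b,d'), (7, 'a,b,d')

-- every route keeps its own row; canonicalRouteID (hypothetical name)
-- is the smallest routeID that shares the same node list
select routeID,
       nodeNames,
       min(routeID) over (partition by nodeNames) canonicalRouteID
from @routes
order by routeID
```

With the sample data, routes 1, 2 and 3 should map to 1, routes 4 and 5 to 4, and routes 6 and 7 to 6.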