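I have some T-SQL code that generates a denormalized flat file from well-organized relational tables. The code completes quickly and the data is not very large, so any suggestion could help. I don't need to worry much about performance, because this process is only meant to run once a month. I have some wiggle room in that regard.
As an example, the source data is laid out like this: a person (table 1) can have many incidents (table 2). Each incident can have many codes (table 3). Each code has an ordered sequence number. So, once flattened, one row in the extract file might look like this: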
Name                IncidentId  Code1  Code2  Code3  Code4
Sue Ellen Crandell  1991        abc1   def1   xyz0   888
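There can be more than 50 of these denormalized ordered code columns. The problem is a new requirement: if the value in one of the ordered code columns is in an exclusion list, the values in the following ordered code columns should shift forward (left) by one position. This means that if def1 is in the exclusion list, the row should look like this: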
Name                IncidentId  Code1  Code2  Code3  Code4
Sue Ellen Crandell  1991        abc1   xyz0   888    <empty string>
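I use dynamic T-SQL to denormalize these ordered code values into a temp table before picking up other relational data and exporting the results to a file. Because I would rather not mess with the dynamic T-SQL, and because of the likely limits on using conditions to shift columns in that part of the process, I think the easiest place to evaluate the exclusion list is after the ordered code values have gone into the temp table.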
If my temp table looks like the first data row above, how can I shift the values so that it looks like the second?
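The exclusion list is just a handful of static values that I can dump into a temp table or use with the IN operator. I suspect a CTE may be needed, but the recursive logic isn't clear to me.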
Answer 0 (score: 1)
First, create a CTE that unpivots the table so that each code is on its own row:
with cte(Name, IncidentId, CodeName, Code)
as(
select Name, IncidentId, CodeName, Code
from Incident i
unpivot(Code for CodeName in (Code1, Code2, Code3, Code4)) unpvt
)
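Now outer join the CTE to itself, filtering out the excluded codes. This gives one row per Name-Incident-Code tuple, but with nulls in the rows for excluded codes (you need those empty rows to keep the code count correct).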
Select t1.Name, t1.IncidentId, isnull(t2.Code, '') Code,
ROW_NUMBER() over(partition by t1.Name, t1.IncidentId order by isnull(t2.CodeName, 'zzz')) CodeNumber
from cte t1
left outer join cte t2 on t1.Name = t2.Name and
t1.IncidentId = t2.IncidentId and
t1.Code = t2.Code and
not exists(select 1 from Exclude e where e.Code = t2.Code)
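Here ROW_NUMBER() creates the new CodeNumber. Ordering by isnull(t2.CodeName, 'zzz') pushes the empty rows to the end, so the rows with valid codes are numbered first (because 'zzz' sorts after any 'Code...' name).
Now you just need to pivot the previous query back so that the codes become columns again: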
select Name, IncidentId, [1] Code1, [2] Code2, [3] as Code3, [4] as Code4
from
(
Select t1.Name, t1.IncidentId, isnull(t2.Code, '') Code, ROW_NUMBER() over(partition by t1.Name, t1.IncidentId order by isnull(t2.CodeName, 'zzz')) CodeNumber
from cte t1
left outer join cte t2 on t1.Name = t2.Name and t1.IncidentId = t2.IncidentId and t1.Code = t2.Code and not exists(select 1 from Exclude e where e.Code = t2.Code)
) x
pivot(max(Code) for CodeNumber in ([1], [2], [3], [4])
) as pvt
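To try the query above end to end, here is a minimal self-contained sketch. The temp tables #Incident and #Exclude, their column sizes, and the sample row are assumptions for illustration; they stand in for the Incident and Exclude tables used in the answer. The query itself is the same.

```sql
-- Hypothetical sample data: #Incident and #Exclude stand in for the
-- answer's Incident and Exclude tables.
create table #Incident (
    Name       varchar(50),
    IncidentId int,
    Code1 varchar(10), Code2 varchar(10), Code3 varchar(10), Code4 varchar(10)
);
create table #Exclude (Code varchar(10));

insert into #Incident values ('Sue Ellen Crandell', 1991, 'abc1', 'def1', 'xyz0', '888');
insert into #Exclude  values ('def1');

-- Unpivot, blank out excluded codes, renumber, and pivot back.
with cte(Name, IncidentId, CodeName, Code)
as (
    select Name, IncidentId, CodeName, Code
    from #Incident i
    unpivot(Code for CodeName in (Code1, Code2, Code3, Code4)) unpvt
)
select Name, IncidentId, [1] as Code1, [2] as Code2, [3] as Code3, [4] as Code4
from (
    select t1.Name, t1.IncidentId, isnull(t2.Code, '') as Code,
           row_number() over (partition by t1.Name, t1.IncidentId
                              order by isnull(t2.CodeName, 'zzz')) as CodeNumber
    from cte t1
    left outer join cte t2
        on  t1.Name = t2.Name
        and t1.IncidentId = t2.IncidentId
        and t1.Code = t2.Code
        and not exists (select 1 from #Exclude e where e.Code = t2.Code)
) x
pivot (max(Code) for CodeNumber in ([1], [2], [3], [4])) as pvt;
-- Result: Sue Ellen Crandell | 1991 | abc1 | xyz0 | 888 | (empty string)
```

Note that UNPIVOT requires all the listed columns to have the same type and length, which is why the sample table declares every code column as varchar(10).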
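There are problems with the code above. For one, when I create CodeNumber with ROW_NUMBER(), I order by CodeName. This breaks down once there are more than 9 code columns, because they no longer sort correctly (they sort alphabetically rather than numerically). So I need to output the code number in the CTE so that I can use it for ordering later: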
with cte(Name, IncidentId, CodeName, CodeNumber, Code)
as(
select Name, IncidentId, CodeName, convert(int, SUBSTRING(CodeName, 5, len(CodeName))), Code
from Incident i
unpivot(Code for CodeName in (Code1, Code2, Code3, Code4, Code5, Code6, Code7, Code8, Code9, Code10)) unpvt
)
The rest of the query now looks like this:
select Name, IncidentId, [1] Code1, [2] Code2, [3] as Code3, [4] as Code4, [5] as Code5, [6] as Code6, [7] as Code7, [8] as Code8, [9] as Code9, [10] as Code10
from
(
Select t1.Name, t1.IncidentId, isnull(t2.Code, '') Code, ROW_NUMBER() over(partition by t1.Name, t1.IncidentId order by isnull(t2.CodeNumber, 999)) NewCodeNumber
from cte t1
left outer join cte t2 on t1.Name = t2.Name and t1.IncidentId = t2.IncidentId and t1.Code = t2.Code and not exists(select 1 from Exclude e where e.Code = t2.Code)
) x
pivot(max(Code) for NewCodeNumber in ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10])
) as pvt
Note that since I now have a column called CodeNumber in the CTE, I call the newly generated number "NewCodeNumber". Also, I now order by t2.CodeNumber instead of t2.CodeName.
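The sorting problem is easy to see in isolation. The following sketch uses a hypothetical inline list of column names (not the real table) to compare the alphabetical rank with the rank obtained from the number extracted by the same SUBSTRING expression used in the CTE:

```sql
-- Hypothetical inline list of column names, for illustration only.
select CodeName,
       convert(int, substring(CodeName, 5, len(CodeName))) as CodeNumber,
       row_number() over (order by CodeName) as AlphabeticalRank,
       row_number() over (order by convert(int, substring(CodeName, 5, len(CodeName)))) as NumericRank
from (values ('Code1'), ('Code2'), ('Code9'), ('Code10')) as v(CodeName)
order by NumericRank;
-- Code10 gets AlphabeticalRank 2 (it sorts between Code1 and Code2)
-- but NumericRank 4, which is the order we actually want.
```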
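Updated SQL Fiddle.
Regarding the question in the comments: you are effectively asking how to unpivot multiple columns, which is not as simple as unpivoting a single column. One way to do it is to unpivot the codes and the code dates separately: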
with cteCode(Name, IncidentId, CodeName, CodeNumber, Code)
as(
select Name, IncidentId, CodeName, convert(int, SUBSTRING(CodeName, 5, len(CodeName))), Code
from Incident i
unpivot(Code for CodeName in (Code1, Code2, Code3, Code4, Code5, Code6, Code7, Code8, Code9, Code10)) unpvt
), cteCodeDate(Name, IncidentId, CodeName, CodeNumber, CodeDate)
as(
select Name, IncidentId, CodeName, convert(int, SUBSTRING(CodeName, 9, len(CodeName))), CodeDate
from Incident i
unpivot(CodeDate for CodeName in (CodeDate1, CodeDate2, CodeDate3, CodeDate4, CodeDate5, CodeDate6, CodeDate7, CodeDate8, CodeDate9, CodeDate10)) unpvt
)
Then join them back together:
Select t1.Name, t1.IncidentId, isnull(t2.Code, '') Code, ROW_NUMBER() over(partition by t1.Name, t1.IncidentId order by isnull(t2.CodeNumber, 999)) NewCodeNumber, t3.CodeDate
from cteCode t1
join cteCodeDate t3 on t3.Name = t1.Name and t3.IncidentId = t1.IncidentId and t3.CodeNumber = t1.CodeNumber
left outer join cteCode t2 on t1.Name = t2.Name and t1.IncidentId = t2.IncidentId and t1.Code = t2.Code and not exists(select 1 from Exclude e where e.Code = t2.Code)
Pivoting multiple columns is not as easy as pivoting a single column either, so I took a different route to the final result:
select Name, IncidentId,
MAX(case when newCodeNumber = 1 then Code end) Code1,
MAX(case when newCodeNumber = 1 then CodeDate end) CodeDate1,
MAX(case when newCodeNumber = 2 then Code end) Code2,
MAX(case when newCodeNumber = 2 then CodeDate end) CodeDate2,
MAX(case when newCodeNumber = 3 then Code end) Code3,
MAX(case when newCodeNumber = 3 then CodeDate end) CodeDate3,
MAX(case when newCodeNumber = 4 then Code end) Code4,
MAX(case when newCodeNumber = 4 then CodeDate end) CodeDate4,
MAX(case when newCodeNumber = 5 then Code end) Code5,
MAX(case when newCodeNumber = 5 then CodeDate end) CodeDate5,
MAX(case when newCodeNumber = 6 then Code end) Code6,
MAX(case when newCodeNumber = 6 then CodeDate end) CodeDate6,
MAX(case when newCodeNumber = 7 then Code end) Code7,
MAX(case when newCodeNumber = 7 then CodeDate end) CodeDate7,
MAX(case when newCodeNumber = 8 then Code end) Code8,
MAX(case when newCodeNumber = 8 then CodeDate end) CodeDate8,
MAX(case when newCodeNumber = 9 then Code end) Code9,
MAX(case when newCodeNumber = 9 then CodeDate end) CodeDate9,
MAX(case when newCodeNumber = 10 then Code end) Code10,
MAX(case when newCodeNumber = 10 then CodeDate end) CodeDate10
from
(
Select t1.Name, t1.IncidentId, isnull(t2.Code, '') Code, ROW_NUMBER() over(partition by t1.Name, t1.IncidentId order by isnull(t2.CodeNumber, 999)) NewCodeNumber, t3.CodeDate
from cteCode t1
join cteCodeDate t3 on t3.Name = t1.Name and t3.IncidentId = t1.IncidentId and t3.CodeNumber = t1.CodeNumber
left outer join cteCode t2 on t1.Name = t2.Name and t1.IncidentId = t2.IncidentId and t1.Code = t2.Code and not exists(select 1 from Exclude e where e.Code = t2.Code)
) x
group by Name, IncidentId
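As an alternative to the two separate UNPIVOTs, several column pairs can be unpivoted in one pass with CROSS APPLY over a VALUES table constructor. This is a different technique from the one used in this answer, shown only as a sketch. The table #IncidentWithDates and its sample row are hypothetical.

```sql
-- Hypothetical sample table with three code/date pairs.
create table #IncidentWithDates (
    Name varchar(50), IncidentId int,
    Code1 varchar(10), CodeDate1 date,
    Code2 varchar(10), CodeDate2 date,
    Code3 varchar(10), CodeDate3 date
);

insert into #IncidentWithDates values
    ('Sue Ellen Crandell', 1991,
     'abc1', '2020-01-05', 'def1', '2020-02-10', 'xyz0', '2020-03-15');

-- Each source row expands into one row per (Code, CodeDate) pair,
-- and the pair keeps its original position in CodeNumber.
select i.Name, i.IncidentId, v.CodeNumber, v.Code, v.CodeDate
from #IncidentWithDates i
cross apply (values
    (1, i.Code1, i.CodeDate1),
    (2, i.Code2, i.CodeDate2),
    (3, i.Code3, i.CodeDate3)
) as v(CodeNumber, Code, CodeDate)
where v.Code is not null;  -- mimic UNPIVOT, which drops NULL values
```

Because each code stays paired with its date in a single row, no join on CodeNumber is needed to bring the two back together. The VALUES list would still have to be generated for 50+ column pairs, just like the UNPIVOT column lists.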
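Answer 1 (score: 0)
This is too long for a comment.
This is best handled in the dynamic SQL. Shifting values from column to column to handle the exclusions is cumbersome at best. It would end up as some variation of: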
if code1 is not excluded then code1
else if code2 is not excluded then code2
else if code3 is not excluded then code3
else if code4 is not excluded then code4
as code1

if code1 is not excluded
    if code2 is not excluded then code2
    else if code3 is not excluded then code3
    else if code4 is not excluded then code4
and so on, and so on, and so on
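Instead, you probably have a place in the dynamic SQL where you can add something like this:
where not exists (select 1 from ExcludedCodes ec where ec.code = the.code)
That way you eliminate the excluded codes before pivoting.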
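A minimal sketch of that idea, assuming the normalized codes are available with an ordering column before the pivot. The temp tables #Codes and #ExcludedCodes, the Seq column, and the sample rows are hypothetical stand-ins for whatever the dynamic SQL reads from:

```sql
-- Hypothetical normalized source: one row per code, ordered by Seq.
create table #Codes (Name varchar(50), IncidentId int, Seq int, Code varchar(10));
create table #ExcludedCodes (Code varchar(10));

insert into #Codes values
    ('Sue Ellen Crandell', 1991, 1, 'abc1'),
    ('Sue Ellen Crandell', 1991, 2, 'def1'),
    ('Sue Ellen Crandell', 1991, 3, 'xyz0'),
    ('Sue Ellen Crandell', 1991, 4, '888');
insert into #ExcludedCodes values ('def1');

-- Drop excluded codes first, renumber what is left, then pivot.
select Name, IncidentId,
       isnull([1], '') as Code1, isnull([2], '') as Code2,
       isnull([3], '') as Code3, isnull([4], '') as Code4
from (
    select c.Name, c.IncidentId, c.Code,
           row_number() over (partition by c.Name, c.IncidentId order by c.Seq) as NewSeq
    from #Codes c
    where not exists (select 1 from #ExcludedCodes ec where ec.Code = c.Code)
) x
pivot (max(Code) for NewSeq in ([1], [2], [3], [4])) as pvt;
-- Result: Sue Ellen Crandell | 1991 | abc1 | xyz0 | 888 | (empty string)
```

One caveat with this sketch: an incident whose codes are all excluded produces no row at all, so it would need to be joined back in from the incident table if it must still appear in the extract.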