Merge records of a SQL table based on column data

Asked: 2012-03-27 08:38:46

Tags: c# sql-server database tsql stored-procedures

I have some dirty resource usage records in t_resourcetable that look like this:

resNo   subres    startdate                        enddate
1        2        2012-01-02 22:03:00.000          2012-01-03 00:00:00.000
1        2        2012-01-03 00:00:00.000          2012-01-04 00:00:00.000
1        2        2012-01-04 00:00:00.000          2012-01-04 16:23:00.000
1        3        2012-01-06 16:23:00.000          2012-01-06 22:23:00.000
2        2        2012-01-04 05:23:00.000          2012-01-06 16:23:00.000

I need to merge those dirty rows like this:

resNo   subres    startdate                        enddate
1        2        2012-01-02 22:03:00.000          2012-01-04 16:23:00.000
1        3        2012-01-06 16:23:00.000          2012-01-06 22:23:00.000
2        2        2012-01-04 05:23:00.000          2012-01-06 16:23:00.000

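The result should be written back to the same table. I have more than 40k rows, so I can't use a cursor. Please help me clean up this data with an optimized SQL statement.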

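The suggested solution with a temp table and GROUP BY does not handle a scenario like the following, where the same resNo and subres have two separate runs of rows. I am looking for a solution to this problem that is not cursor-based.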

resNo   subres    startdate                        enddate
1        2        2012-01-02 22:03:00.000          2012-01-03 00:00:00.000
1        2        2012-01-03 00:00:00.000          2012-01-04 00:00:00.000
1        2        2012-01-04 00:00:00.000          2012-01-04 16:23:00.000
1        2        2012-01-14 10:09:00.000          2012-01-15 00:00:00.000
1        2        2012-01-15 00:00:00.000          2012-01-16 00:00:00.000
1        2        2012-01-16 00:00:00.000          2012-01-16 03:00:00.000
1        3        2012-01-06 16:23:00.000          2012-01-06 22:23:00.000
2        2        2012-01-04 05:23:00.000          2012-01-06 16:23:00.000

I need to merge those dirty rows like this:

resNo   subres    startdate                        enddate
1        2        2012-01-02 22:03:00.000          2012-01-04 16:23:00.000
1        2        2012-01-14 10:09:00.000          2012-01-16 03:00:00.000
1        3        2012-01-06 16:23:00.000          2012-01-06 22:23:00.000
2        2        2012-01-04 05:23:00.000          2012-01-06 16:23:00.000

Please help me get out of this dirty data problem.

3 answers:

Answer 0: (score: 1)

You need to group the data by resNo and subRes, like this:

select resNo, subRes, min(startdate), max(enddate)
from  t_resourcetable
group by resNo, subRes

and insert the result into a temp table.

Then you can truncate t_resourcetable and insert the result from the temp table back into it.
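A minimal sketch of that round trip, assuming SQL Server and assuming the table has only the four columns shown in the question. The temp table name #merged is hypothetical. Because the grouping yields one row per resNo/subRes pair, it also merges runs that are separated by a gap in time.

```sql
-- Sketch only: #merged is a hypothetical temp table name.
BEGIN TRANSACTION;

-- 1. Collapse each resNo/subRes pair into a single row.
SELECT resNo,
       subRes,
       MIN(startdate) AS startdate,
       MAX(enddate)   AS enddate
INTO   #merged
FROM   t_resourcetable
GROUP  BY resNo, subRes;

-- 2. Empty the original table (fails if foreign keys reference it).
TRUNCATE TABLE t_resourcetable;

-- 3. Copy the merged rows back.
INSERT INTO t_resourcetable (resNo, subRes, startdate, enddate)
SELECT resNo, subRes, startdate, enddate
FROM   #merged;

COMMIT TRANSACTION;

DROP TABLE #merged;
```

Wrapping the steps in one transaction means that a failure between the truncate and the insert does not leave the table empty.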

Answer 1: (score: 0)

The first step is to create a backup:

select * 
into t_resourcetable_backup20120327
from t_resourcetable

Then update enddate on the first record of each resNo and subRes group:

update t_resourcetable
set enddate = (select max (enddate) 
                 from t_resourcetable t1 
                where t1.resNo = t_resourcetable.resNo
                  and t1.subRes = t_resourcetable.subRes)
 where not exists (select null
                 from t_resourcetable t1
                where t1.resNo = t_resourcetable.resNo
                  and t1.subRes = t_resourcetable.subRes
                  and t1.startdate < t_resourcetable.startdate)

Finally, delete the extra records:

delete t_resourcetable
 where exists (select null
                 from t_resourcetable t1
                where t1.resNo = t_resourcetable.resNo
                  and t1.subRes = t_resourcetable.subRes
                  and t1.startdate < t_resourcetable.startdate)

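If a combination of resNo and subRes has duplicate startdates, duplicates will be left behind. You should also check whether every enddate has a corresponding startdate, because otherwise you will lose the gaps, although that might be exactly what you want.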
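Both caveats can be checked before touching the data. The queries below are a sketch under the assumption that the column names match the question. The aliases (later, nxt, row_count) are illustrative only.

```sql
-- 1. resNo/subRes pairs with more than one row for the same startdate.
SELECT resNo, subRes, startdate, COUNT(*) AS row_count
FROM   t_resourcetable
GROUP  BY resNo, subRes, startdate
HAVING COUNT(*) > 1;

-- 2. Rows followed by a later row in the same group, but whose enddate
--    is not the startdate of any row in that group. Each such row sits
--    just before a gap that the update/delete would close.
SELECT t.resNo, t.subRes, t.startdate, t.enddate
FROM   t_resourcetable t
WHERE  EXISTS (SELECT NULL
               FROM   t_resourcetable later
               WHERE  later.resNo = t.resNo
                      AND later.subRes = t.subRes
                      AND later.startdate > t.startdate)
       AND NOT EXISTS (SELECT NULL
                       FROM   t_resourcetable nxt
                       WHERE  nxt.resNo = t.resNo
                              AND nxt.subRes = t.subRes
                              AND nxt.startdate = t.enddate);
```

On the second sample in the question, the second query should flag the row ending at 2012-01-04 16:23, since the next row for resNo 1 and subres 2 only starts on 2012-01-14.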

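Besides creating a backup, you can wrap the update/delete in a transaction, run a select after the delete, and then roll back. Check the data in Excel, and if everything is fine, repeat the transaction, this time committing.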
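A sketch of that dry run, assuming SQL Server. The update and delete are the statements from this answer. Only the transaction wrapper and the inspection query are added.

```sql
BEGIN TRANSACTION;

-- Stretch the first row of each group to the group's latest enddate.
UPDATE t_resourcetable
SET    enddate = (SELECT MAX(enddate)
                  FROM   t_resourcetable t1
                  WHERE  t1.resNo = t_resourcetable.resNo
                         AND t1.subRes = t_resourcetable.subRes)
WHERE  NOT EXISTS (SELECT NULL
                   FROM   t_resourcetable t1
                   WHERE  t1.resNo = t_resourcetable.resNo
                          AND t1.subRes = t_resourcetable.subRes
                          AND t1.startdate < t_resourcetable.startdate);

-- Remove every row that is not the first of its group.
DELETE t_resourcetable
WHERE  EXISTS (SELECT NULL
               FROM   t_resourcetable t1
               WHERE  t1.resNo = t_resourcetable.resNo
                      AND t1.subRes = t_resourcetable.subRes
                      AND t1.startdate < t_resourcetable.startdate);

-- Inspect the outcome while the changes are still uncommitted.
SELECT resNo, subRes, startdate, enddate
FROM   t_resourcetable
ORDER  BY resNo, subRes, startdate;

-- First pass: undo everything after reviewing the output.
ROLLBACK TRANSACTION;
-- Second pass: once the data looks right, rerun the script with
-- COMMIT TRANSACTION in place of the ROLLBACK above.
```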

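Update: this query identifies gaps. If you are using SQL Server 2000, convert the CTEs to derived tables. The CTE first returns the list of resources that have no predecessor, and last does the same for resources without a successor. Both number the gaps. The lists are then joined on resNo, subRes and gap number.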

;with first as (
    select resNo, subres, startdate,
           row_number() over (partition by resNo, subres order by startdate) rowNumber
      from t_resourcetable
     where not exists (select null
                         from t_resourcetable t1
                        where t1.resNo = t_resourcetable.resNo
                          and t1.subres = t_resourcetable.subres
                          and t1.enddate = t_resourcetable.startdate)
),
last as (
    select resNo, subres, enddate,
           row_number() over (partition by resNo, subres order by enddate) rowNumber
      from t_resourcetable
     where not exists (select null
                         from t_resourcetable t1
                        where t1.resNo = t_resourcetable.resNo
                          and t1.subres = t_resourcetable.subres
                          and t1.startdate = t_resourcetable.enddate)
)
select first.resNo, first.subres, first.startdate, last.enddate
  from first
 inner join last
    on first.resNo = last.resNo
   and first.subres = last.subres
   and first.rowNumber = last.rowNumber
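To try the query, the second sample from the question can be loaded into a scratch database. The script below is a sketch: the column types are assumptions, since the question does not state them, and the multi-row VALUES syntax needs SQL Server 2008 or later.

```sql
-- Assumed schema: the question does not give column types.
CREATE TABLE t_resourcetable (
    resNo     int      NOT NULL,
    subres    int      NOT NULL,
    startdate datetime NOT NULL,
    enddate   datetime NOT NULL
);

-- Sample rows from the question. ISO 8601 literals are read the
-- same way regardless of the session's language or date format.
INSERT INTO t_resourcetable (resNo, subres, startdate, enddate)
VALUES (1, 2, '2012-01-02T22:03:00', '2012-01-03T00:00:00'),
       (1, 2, '2012-01-03T00:00:00', '2012-01-04T00:00:00'),
       (1, 2, '2012-01-04T00:00:00', '2012-01-04T16:23:00'),
       (1, 2, '2012-01-14T10:09:00', '2012-01-15T00:00:00'),
       (1, 2, '2012-01-15T00:00:00', '2012-01-16T00:00:00'),
       (1, 2, '2012-01-16T00:00:00', '2012-01-16T03:00:00'),
       (1, 3, '2012-01-06T16:23:00', '2012-01-06T22:23:00'),
       (2, 2, '2012-01-04T05:23:00', '2012-01-06T16:23:00');
```

Against these rows, the gap query should return the four merged rows from the question: two for resNo 1 and subres 2, and one each for the other two pairs.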

Answer 2: (score: 0)

Can you try this?

SELECT resno,
       subres,
       startdate,
       MIN(enddate) AS enddate
FROM   (SELECT t1.resno,
               t1.subres,
               t1.startdate,
               t2.enddate
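        FROM   t_resourcetable t1
               INNER JOIN t_resourcetable t2
                       -- pair rows only within the same resNo and subres
                       ON  t1.resno = t2.resno
                       AND t1.subres = t2.subres
        WHERE  t1.enddate <= t2.enddate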
               AND NOT EXISTS (SELECT *
                               FROM   t_resourcetable t3
                               WHERE  ( t1.resno = t3.resno
                                        AND t1.subres = t3.subres
                                        AND t1.startdate > t3.startdate
                                        AND t1.startdate <= t3.enddate )
                                       OR ( t2.resno = t3.resno
                                            AND t2.subres = t3.subres
                                            AND t2.enddate >= t3.startdate
                                            AND t2.enddate < t3.enddate )))t
GROUP  BY resno,
          subres,
          startdate 

The picture looks like this:

[Image: TimeLine]