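A slightly tricky SQL question (we are running SQL Server 2008 R2).
In a log table I have to combine consecutive records that have the same message, store the number of combined messages, and delete the records that were combined.
To make this easier to understand, here is a small data sample: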
ID DATE MSG COUNT
1 2013-08-17 mail NULL
2 2013-08-17 mail NULL
3 2013-08-17 www NULL
4 2013-08-18 www NULL
5 2013-08-18 www NULL
6 2013-08-18 www NULL
7 2013-08-18 mail NULL
8 2013-08-18 www NULL
9 2013-08-19 mail NULL
10 2013-08-19 mail NULL
11 2013-08-20 mail NULL
12 2013-08-20 mail NULL
13 2013-08-21 www NULL
14 2013-08-22 mail NULL
15 2013-08-22 mail NULL
16 2013-08-23 mail NULL
17 2013-08-23 mail NULL
18 2013-08-23 mail NULL
The result should look like this:
ID DATE MSG COUNT
1 2013-08-17 mail NULL
2 2013-08-17 mail NULL
3 2013-08-17 www NULL
6 2013-08-18 www 3
7 2013-08-18 mail 1
8 2013-08-18 www 1
12 2013-08-20 mail 4
13 2013-08-21 www 1
15 2013-08-22 mail 2
16 2013-08-23 mail NULL
17 2013-08-23 mail NULL
18 2013-08-23 mail NULL
So, basically, the query should
Since I am no SQL expert, I would greatly appreciate any help, suggestions, or a SQL query.
Thanks in advance...
Answer 0 (score: 1)
My idea is to do it with two queries:
(i) The first one only counts and updates the records.
(ii) The second one deletes all records in the date range whose COUNT column is NULL.
EDIT: I got step (i) working, but I could not preserve the NULL value in COUNT to mark the rows that should be deleted: it updates all rows. Now you just need to DELETE the right rows.
Step (i):
(for MySQL)
UPDATE tab ta JOIN
(SELECT date, msg, COUNT(*) AS cnt FROM tab GROUP BY date, msg) tb
SET ta.count = tb.cnt
WHERE ta.date = tb.date AND ta.msg = tb.msg AND
ta.date BETWEEN
DATE('2013-08-18') AND DATE('2013-08-21');
(for MS SQL Server)
UPDATE ta
SET ta.count = tb.cnt
FROM tab ta,
(SELECT date, msg, COUNT(*) AS cnt FROM tab GROUP BY date, msg) tb
WHERE ta.date = tb.date AND ta.msg = tb.msg AND ta.date
BETWEEN CAST('2013-08-18' AS DATE) AND CAST('2013-08-20' AS DATE);
PS: The DATE syntax in the first query is for MySQL; you may need to adapt it for MS SQL Server.
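For step (ii), one possible way to pick "the right rows" without relying on NULL markers is to delete every row in the range that has a later row with the same date and msg. This is only a sketch for MS SQL Server: it assumes the table has a unique, increasing id column (as in the question's sample), and the date range should be whatever you used in step (i).

```sql
-- Sketch only: assumes tab(id, date, msg, count) with a unique, increasing id.
-- Keeps the row with the highest id in each (date, msg) group and deletes the rest.
DELETE ta
FROM tab AS ta
WHERE ta.[date] BETWEEN CAST('2013-08-18' AS DATE) AND CAST('2013-08-20' AS DATE)
  AND EXISTS (
        SELECT 1
        FROM tab AS tb
        WHERE tb.[date] = ta.[date]
          AND tb.msg    = ta.msg
          AND tb.id     > ta.id   -- a later row of the same group exists
      );
```

Because the UPDATE in step (i) writes the same count to every row of a group, the surviving row already carries the right value.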
Answer 1 (score: 1)
Try this:
IF OBJECT_ID('tempdb..#temp') IS NOT NULL DROP TABLE #temp
GO
select
*
into #temp
from (
select 1 as id, cast('2013-08-17' as date) as [date], 'mail' as msg, cast(NULL as int) as [count] union all
select 2,'2013-08-17','mail',NULL union all
select 3,'2013-08-17','www',NULL union all
select 4,'2013-08-18','www',NULL union all
select 5,'2013-08-18','www',NULL union all
select 6,'2013-08-18','www',NULL union all
select 7,'2013-08-18','mail',NULL union all
select 8,'2013-08-18','www',NULL union all
select 9,'2013-08-19','mail',NULL union all
select 10,'2013-08-19','mail',NULL union all
select 11,'2013-08-20','mail',NULL union all
select 12,'2013-08-20','mail',NULL union all
select 13,'2013-08-21','www',NULL union all
select 14,'2013-08-22','mail',NULL union all
select 15,'2013-08-22','mail',NULL union all
select 16,'2013-08-23','mail',NULL union all
select 17,'2013-08-23','mail',NULL union all
select 18,'2013-08-23','mail',NULL
) x
GO
select
t.*,
rwn
from #temp t
join (
select
id, [date], [msg], [rwn] = row_number() over(partition by [date], [msg] order by id )
from #temp
where 1=1
and [date] between '2013-08-18' and '2013-08-22'
) x
on t.id=x.id
order by
t.date, t.msg
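Just turn it into an UPDATE, then delete all rows where rwn > 1.
EDIT: Your data type is probably text, which is why you get an error on sort/compare. Do you really need text? It is a large object (LOB) data type that can store up to 2 GB of text. Try changing it to varchar(8000), or, if the messages really are that large, varchar(max) will also do.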
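A rough sketch of what that UPDATE and DELETE could look like, reusing the same join against #temp. Two things here are my assumptions rather than part of the query above: the row number is ordered by id descending so that rwn = 1 is the last row of each group (the one the expected result keeps), and COUNT(*) OVER supplies the group size.

```sql
-- Sketch only: assumes #temp as created above, with ids that sort numerically.
-- 1) Write the group size onto the row that will be kept (highest id per date/msg).
UPDATE t
SET [count] = x.cnt
FROM #temp AS t
JOIN (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY [date], [msg] ORDER BY id DESC) AS rwn,
           COUNT(*)     OVER (PARTITION BY [date], [msg])                  AS cnt
    FROM #temp
    WHERE [date] BETWEEN '2013-08-18' AND '2013-08-22'
) AS x
  ON t.id = x.id
WHERE x.rwn = 1;

-- 2) Delete every other row of each group.
DELETE t
FROM #temp AS t
JOIN (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY [date], [msg] ORDER BY id DESC) AS rwn
    FROM #temp
    WHERE [date] BETWEEN '2013-08-18' AND '2013-08-22'
) AS x
  ON t.id = x.id
WHERE x.rwn > 1;
```

Rows outside the date range never appear in the derived table, so they are left untouched.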
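Answer 2 (score: 1)
Hi, please try this; I hope it helps. As I understand it, you need to group the rows, delete the duplicates, and keep one. Sorry for my English.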
DECLARE @Table_2 TABLE (ID INT, [DATE] date, MSG varchar(50), [COUNT] int);
DECLARE @fromDate date = '2013-08-18';
DECLARE @toDate date = '2013-08-22';
-- Highest ID and row count of each (DATE, MSG) group in the range
INSERT INTO @Table_2 (ID, [DATE], MSG, [COUNT])
SELECT MAX(ID) AS ID, [DATE], MSG, COUNT(*) AS [COUNT]
FROM dbo.Table_1
WHERE [DATE] BETWEEN @fromDate AND @toDate
GROUP BY [DATE], MSG;
-- Write the count onto the row that is kept
UPDATE T1
SET [COUNT] = T2.[COUNT]
FROM dbo.Table_1 AS T1
INNER JOIN @Table_2 AS T2
ON T1.ID = T2.ID;
-- Delete the other rows of each group
DELETE T1
FROM dbo.Table_1 AS T1
INNER JOIN @Table_2 AS T2
ON T1.[DATE] = T2.[DATE] AND T1.MSG = T2.MSG
WHERE T1.ID <> T2.ID;
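One caveat: grouping by DATE and MSG is not quite the same as grouping consecutive rows. In the question's sample, IDs 9 to 12 span two dates but form a single run of mail, and ID 8 is a separate www run on the same date as IDs 4 to 6. If runs of consecutive IDs are what is wanted, a row-number-difference ("gaps and islands") query is one way to find them on SQL Server 2008 R2, which has no LAG/LEAD. This is a sketch, not a tested solution: it assumes dbo.Table_1 as used above, a MSG column of a comparable type such as varchar, and #runs and #summary are temp table names I made up.

```sql
-- Sketch only: #runs and #summary are hypothetical helper tables.
DECLARE @fromDate date = '2013-08-18';
DECLARE @toDate   date = '2013-08-22';

-- Rows of one run share the same (MSG, grp): the overall row number and the
-- per-message row number advance together while the message stays the same.
SELECT ID, MSG,
       ROW_NUMBER() OVER (ORDER BY ID)
     - ROW_NUMBER() OVER (PARTITION BY MSG ORDER BY ID) AS grp
INTO #runs
FROM dbo.Table_1
WHERE [DATE] BETWEEN @fromDate AND @toDate;

-- Last ID and length of each run
SELECT MSG, grp, MAX(ID) AS keep_id, COUNT(*) AS cnt
INTO #summary
FROM #runs
GROUP BY MSG, grp;

-- Store the run length on the last row of each run
UPDATE t
SET [COUNT] = s.cnt
FROM dbo.Table_1 AS t
JOIN #summary AS s ON s.keep_id = t.ID;

-- Delete the earlier rows of each run
DELETE t
FROM dbo.Table_1 AS t
JOIN #runs    AS r ON r.ID = t.ID
JOIN #summary AS s ON s.MSG = r.MSG AND s.grp = r.grp
WHERE t.ID <> s.keep_id;

DROP TABLE #runs;
DROP TABLE #summary;
```

On the sample data this keeps IDs 6, 7, 8, 12, 13 and 15 within the range, with counts 3, 1, 1, 4, 1 and 2, and leaves the rows outside the range as they were.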