I'm trying to get the current status of my reports.
I have a few tables that store the basic information:
[SDate - DateTime] [Value] [ReportServer] [reportid]
2010-11-16 10:10:00 1 Server1 1
2010-11-16 10:11:00 0 Server2 1
2010-11-16 10:12:00 1 Server1 1
2010-11-16 10:13:00 1 Server2 1
The second table stores notification data, so that whenever a report fails to generate and this query runs, it tells me the last status:
[alertdate - DateTime] [Value] [reportid]
2010-11-16 10:10:00 1 1
2010-11-16 10:11:00 0 1
2010-11-16 10:12:00 1 1
2010-11-16 10:13:00 1 1
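If I run my query and the last report run failed, I need the query to return the reportid that is in a failed state.
The second query does the opposite of sorts: I need to know which reports failed previously but have run successfully since the failure.
Here is my existing, partially working query: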
SELECT *
FROM reports
WHERE reportid IN (
    SELECT epr.reportid
    FROM (SELECT t.*,
                 lag(t.isup)
                     OVER (ORDER BY (reportdate)) AS prev_value
          FROM reportresults t
          WHERE iserrored = 1) t
    INNER JOIN reportresults epr
        ON epr.reportid = t.reportid
    WHERE t.server != epr.server
      AND prev_value != 0
      AND t.reportid NOT IN (SELECT reportid
                             FROM reportresults
                             WHERE cast(t.reportdate AS datetime)
                                   < cast(reportdate AS datetime)
                               AND reportid = t.reportid
                               AND iserrored = 0)
    GROUP BY epr.endpointid
)
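Can anyone tell me how to get the reports that are in error, as well as the errored reports that have recovered?
Does my query do exactly what I need?
Edit
Here is the table structure: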
CREATE TABLE [Reports](
[ReportID] [uniqueidentifier] NOT NULL,
[CustomerID] [int] NOT NULL,
[ReportTypeID] [int] NOT NULL
)
CREATE TABLE [checker].[EndpointResults](
[ReportResultID] [uniqueidentifier] NOT NULL,
[ReportID] [uniqueidentifier] NOT NULL,
[ReportServerID] [nvarchar](50) NOT NULL,
[CheckDate] [datetimeoffset](3) NOT NULL,
[ReportTypeID] [int] NOT NULL,
[Iserrored] [bit] NOT NULL,
[Message] [nvarchar](50) NOT NULL
)
CREATE TABLE [dbo].[ReportAlerts](
[AlertID] [int] IDENTITY(1,1) NOT NULL,
[AlertDate] [datetimeoffset](3) NOT NULL,
[issent] [bit] NOT NULL CONSTRAINT [DF_Alert_issent] DEFAULT ((0)),
[ReportServerID] [nvarchar](50) NULL,
[ReportID] [uniqueidentifier] NULL,
[iserrored] [bit] NOT NULL CONSTRAINT [DF_Alert_iserrored] DEFAULT ((0))
)
Further explanation:
TABLE [Reports]
[ReportID] [uniqueidentifier] NOT NULL,
[CustomerID] [int] NOT NULL,
[ReportTypeID] [int] NOT NULL
[reportid] [customerid] [reporttypeid]
------------------------------------ ---------------- ----------------
8D14EB9C-9C1E-4BBC-A3DE-15202072B1A2 1 0
FF302416-899B-432A-AA8B-E3207CF3C24F 1 0
F31F2C20-7182-45C6-93EE-FB332740B620 1 0
TABLE [checker].[EndpointResults]
[ReportResultID] [uniqueidentifier] NOT NULL,
[ReportID] [uniqueidentifier] NOT NULL,
[ReportServerID] [nvarchar](50) NOT NULL,
[CheckDate] [datetimeoffset](3) NOT NULL,
[ReportTypeID] [int] NOT NULL,
[Iserrored] [bit] NOT NULL,
[Message] [nvarchar](50) NOT NULL
ReportResultID ReportID ReportServerID CheckDate IsErrored
------------------------------------ ------------------------------------ ----------------------------------- ---------------------------------- ----------
5D78DA02-6D45-42D5-846E-BD17ADA21E9B F31F2C20-7182-45C6-93EE-FB332740B620 Server1 2016-02-04 20:18:39.459 -06:00 1
ABE92139-3D9D-4C84-96E3-28BBC40AF720 FF302416-899B-432A-AA8B-E3207CF3C24F Server2 2016-02-04 20:18:34.990 -06:00 0
DC56CFE5-53E5-4CE6-95F4-3816FBC4915A FF302416-899B-432A-AA8B-E3207CF3C24F Server1 2016-02-04 20:18:34.957 -06:00 0
EA9D6F69-09F9-4BEA-9858-BD10B058CC72 8D14EB9C-9C1E-4BBC-A3DE-15202072B1A2 Server1 2016-02-04 20:18:34.945 -06:00 0
D45C6316-17B2-462B-8BF9-4B012DBF9D4C 8D14EB9C-9C1E-4BBC-A3DE-15202072B1A2 Server2 2016-02-04 20:18:34.624 -06:00 0
231CC017-53AC-4F15-B620-B4FB8B0EE6C3 F31F2C20-7182-45C6-93EE-FB332740B620 Server2 2016-02-04 20:18:34.607 -06:00 1
7582273C-8AA5-4D42-9987-0116868269B8 F31F2C20-7182-45C6-93EE-FB332740B620 Server1 2016-02-04 20:18:29.461 -06:00 1
C006F277-63A3-4A31-965C-012624322C44 FF302416-899B-432A-AA8B-E3207CF3C24F Server1 2016-02-04 20:18:24.945 -06:00 0
DF1FE3C1-59DE-47FE-AE3E-DE11B83AFA74 8D14EB9C-9C1E-4BBC-A3DE-15202072B1A2 Server1 2016-02-04 20:18:24.932 -06:00 0
F05F4102-DFB3-4E98-B4E3-DB94FDD3959A F31F2C20-7182-45C6-93EE-FB332740B620 Server2 2016-02-04 20:18:24.647 -06:00 0
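As you can see above, I have two servers running the same reports.
If I queried the 10 records above, the result should tell me that report F31F2C20-7182-45C6-93EE-FB332740B620 is in an error state and that no server has run it successfully since 2016-02-04 20:18:24.647 -06:00. The query would therefore return only F31F2C20-7182-45C6-93EE-FB332740B620, because it is the only report currently in error.
The next query would join the alerts table and tell me whether the admins have already been notified that the report is in an error state. So: join the result above to the alerts table where the ReportID above equals the reportid from the alerts. If no alert has been recorded after the last failed report result, the query tells me that I need to insert a new alert and send it.
Then the last one: suppose the most recent entry in the 10 rows above showed IsErrored = 0.
So the report started failing at 2016-02-04 20:18:24.647 -06:00, failed again at 2016-02-04 20:18:29.461 -06:00 and 2016-02-04 20:18:34.607 -06:00, and the next successful run was at 2016-02-04 20:18:39.459 -06:00.
The query then only needs to return reportid F31F2C20-7182-45C6-93EE-FB332740B620 if the alerts table contains no record for it.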
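Answer 0 (score: 0)
Edit: re. potentially many servers, understood.
OK then... regarding identifying the reportIDs that are in error (e.g. those with 2 isErrored values of 1):
Get the most recent run attempt for each reportID + serverID pair...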
select a.reportID as reportID
, a.reportServerID as serverID
, a.checkDate as mostRecentTime
, a.isErrored as errFlag
from checker.EndpointResults a
where not exists (
select 1 from checker.EndpointResults b
where b.reportID = a.reportID
and b.reportServerID = a.reportServerID
-- This next and-condition makes sure that whatever
-- record we choose for a given ReportID + ServerID
-- above is the most recent record.
and b.checkDate > a.checkDate )
So, if that works (let me know), then we can count how many failures each report currently has.
with MostRecentRuns( reportID, serverID, mostRecentTime, errFlag ) as (
-- Start by filtering out only the last runs for our reports on all servers.
-- Note this will not detect reports that NEVER run, or just ran ok "one time"
-- (which could be a problem if it last ran like over a year ago.)
select a.reportID as reportID
, a.reportServerID as serverID
, a.checkDate as mostRecentTime
, a.isErrored as errFlag
from checker.EndpointResults a
where not exists (
select 1 from checker.EndpointResults b
where b.reportID = a.reportID
and b.reportServerID = a.reportServerID
and b.checkDate > a.checkDate )
)
, FailCountsByReport ( reportID, errFlagSum ) as (
    select a.reportID
         -- isErrored is a bit column, so cast it to int before summing
         , sum( cast( a.errFlag as int ) ) as errFlagSum
      from MostRecentRuns a
     group by a.reportID
)
select *
from FailCountsByReport
where errFlagSum >= 2
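To make the idea above easy to try in isolation, here is a minimal, self-contained sketch. It assumes SQL Server, uses a table variable (`@EndpointResults`, a hypothetical stand-in for `checker.EndpointResults`) loaded with the ten sample rows from the question, and shortens the IDs to their last four characters. It also swaps the `not exists` self-join for `row_number()`, which is a different technique that should pick the same "latest run per report + server" rows.

```sql
-- Hypothetical stand-in for checker.EndpointResults (shortened IDs).
declare @EndpointResults table (
    ReportID       char(4)           not null,
    ReportServerID nvarchar(50)      not null,
    CheckDate      datetimeoffset(3) not null,
    IsErrored      bit               not null
);

insert into @EndpointResults (ReportID, ReportServerID, CheckDate, IsErrored) values
    ('B620', N'Server1', '2016-02-04 20:18:39.459 -06:00', 1),
    ('C24F', N'Server2', '2016-02-04 20:18:34.990 -06:00', 0),
    ('C24F', N'Server1', '2016-02-04 20:18:34.957 -06:00', 0),
    ('B1A2', N'Server1', '2016-02-04 20:18:34.945 -06:00', 0),
    ('B1A2', N'Server2', '2016-02-04 20:18:34.624 -06:00', 0),
    ('B620', N'Server2', '2016-02-04 20:18:34.607 -06:00', 1),
    ('B620', N'Server1', '2016-02-04 20:18:29.461 -06:00', 1),
    ('C24F', N'Server1', '2016-02-04 20:18:24.945 -06:00', 0),
    ('B1A2', N'Server1', '2016-02-04 20:18:24.932 -06:00', 0),
    ('B620', N'Server2', '2016-02-04 20:18:24.647 -06:00', 0);

with Ranked as (
    -- rn = 1 marks the most recent run for each report + server pair
    select ReportID, ReportServerID, CheckDate, IsErrored
         , row_number() over ( partition by ReportID, ReportServerID
                               order by CheckDate desc ) as rn
      from @EndpointResults
)
select ReportID
     , sum( cast( IsErrored as int ) ) as errFlagSum
  from Ranked
 where rn = 1
 group by ReportID
having sum( cast( IsErrored as int ) ) >= 2;
```

With this sample data only B620 should come back, since the latest run on both servers failed. The `>= 2` threshold bakes in the assumption of exactly two servers; comparing the sum against `count(*)` instead would be one way to express "failed on every server" without hard-coding the number.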
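This part may be of historical interest for pinning down "in error".
The notes & analysis below may still be useful, but the SQL snippets are not
(so please don't try to run them).
So... I think we're getting closer. :-) Thanks for the additional details; the sample data helps.
Let's shorten the IDs a bit. It turns out the last 4 digits are distinct in this data sample, which makes it easier to talk about "B620" (versus writing "F31F2C20-7182-45C6-93EE-FB332740B620" all the time).
If we order by ReportID and CheckDate, it looks like this...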
+---------+---------+-----------+--------------+---------+
| Report  | Report  | Report    | Check        | Is      |
| ResultID| ID      | ServerID  | Date <1>     | Errored |
+---------+---------+-----------+--------------+---------+
| 959A    | B620    | Server2   | 20:18:24.647 | 0       |
| 69B8    | B620    | Server1   | 20:18:29.461 | 1       |
| E6C3    | B620    | Server2   | 20:18:34.607 | 1       |
| 1E9B    | B620    | Server1   | 20:18:39.459 | 1       |
+---------+---------+-----------+--------------+---------+
| F720    | C24F    | Server2   | 20:18:34.990 | 0       |
| 915A    | C24F    | Server1   | 20:18:34.957 | 0       |
| 2C44    | C24F    | Server1   | 20:18:24.945 | 0       |
+---------+---------+-----------+--------------+---------+
| CC72    | B1A2    | Server1   | 20:18:34.945 | 0       |
| 9D4C    | B1A2    | Server2   | 20:18:34.624 | 0       |
| FA74    | B1A2    | Server1   | 20:18:24.932 | 0       |
+---------+---------+-----------+--------------+---------+
<1> For CheckDate only HH:MM:SS.mmm is shown;
prefix with "2016-02-04 " for the complete date and time.
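I think your definition of "in error" is still incomplete.
Let me give an example to illustrate: which of these reports is "in error"?
In these simplified scenarios I've shortened the timestamps.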
+------+------+-------+---+
|RepID | SrvID| Time  |Err|
+------+------+-------+---+
| A001 | 1    | 20:00 | 0 |  Report ID A001 is obviously fine,
|  "   | 2    | 20:05 | 0 |  no errors at all.
|  "   | 1    | 20:10 | 0 |
|  "   | 2    | 20:15 | 0 |
+------+------+-------+---+
| B002 | 1    | 20:00 | 0 |  B002 is obviously a problem,
|  "   | 2    | 20:05 | 0 |  all recent runs on each server
|  "   | 1    | 20:10 | 1 |  failed.
|  "   | 2    | 20:15 | 1 |
+------+------+-------+---+
| C003 | 1    | 20:00 | 0 |  C003 is less clear.
|  "   | 2    | 20:05 | 0 |  Latest run on Server #1 was ok.
|  "   | 1    | 20:10 | 0 |  Latest run on Server #2 failed.
|  "   | 2    | 20:15 | 1 |  So is C003 a problem? It was sort of ok.
+------+------+-------+---+
| D004 | 1    | 20:00 | 1 |  D004 is less clear.
|  "   | 2    | 20:05 | 1 |  Latest run on Server #1 failed,
|  "   | 1    | 20:10 | 1 |  but latest run on Server #2 was ok.
|  "   | 2    | 20:15 | 0 |  So should D004 be a problem?
+------+------+-------+---+
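Error = all failed?
Do you want "in error" to mean that every server's most recent attempt to run the report failed?
If so, only B002 would be "in error" in the example above.
Error = any failed?
Or should "in error" mean that the most recent attempt on at least one server failed?
If so, then B002, C003 and D004 would all be "in error".
Error = most recent failure?
In this variant only the single most recent run of the report counts, regardless of which server it ran on, so only B002 and C003 would be "in error".
Let me know which definition of "in error" best fits your situation.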
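The three candidate definitions can be compared side by side. The following is only a sketch, assuming SQL Server and a hypothetical table variable `@Runs` that holds the A001 to D004 scenarios from the table above; the column names (`RepID`, `SrvID`, `RunAt`, `Err`) are made up for the illustration.

```sql
-- Hypothetical table holding the simplified scenarios from the table above.
declare @Runs table (
    RepID char(4) not null,
    SrvID int     not null,
    RunAt time(0) not null,
    Err   bit     not null
);

insert into @Runs (RepID, SrvID, RunAt, Err) values
    ('A001', 1, '20:00', 0), ('A001', 2, '20:05', 0), ('A001', 1, '20:10', 0), ('A001', 2, '20:15', 0),
    ('B002', 1, '20:00', 0), ('B002', 2, '20:05', 0), ('B002', 1, '20:10', 1), ('B002', 2, '20:15', 1),
    ('C003', 1, '20:00', 0), ('C003', 2, '20:05', 0), ('C003', 1, '20:10', 0), ('C003', 2, '20:15', 1),
    ('D004', 1, '20:00', 1), ('D004', 2, '20:05', 1), ('D004', 1, '20:10', 1), ('D004', 2, '20:15', 0);

with Ranked as (
    select RepID, SrvID, RunAt
         , cast( Err as int ) as Err
           -- rnServer = 1: latest run of this report on this server
         , row_number() over ( partition by RepID, SrvID order by RunAt desc ) as rnServer
           -- rnReport = 1: latest run of this report on any server
         , row_number() over ( partition by RepID order by RunAt desc ) as rnReport
      from @Runs
)
select RepID
       -- "all failed": the smallest flag among the per-server latest runs is 1
     , case when min( case when rnServer = 1 then Err end ) = 1 then 1 else 0 end as allFailed
       -- "any failed": the largest flag among the per-server latest runs is 1
     , case when max( case when rnServer = 1 then Err end ) = 1 then 1 else 0 end as anyFailed
       -- "most recent failure": the flag of the single latest run
     , max( case when rnReport = 1 then Err end ) as latestFailed
  from Ranked
 group by RepID
 order by RepID;
```

On this data the flags should line up with the discussion: `allFailed` is set only for B002, `anyFailed` for B002, C003 and D004, and `latestFailed` for B002 and C003. Whichever column matches your intent tells you which filter to carry into the real query.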
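For now, please ignore everything from here on, pending an answer to the question above re. the exact definition of "in error".
Edit: please retry the second SQL example first (example #3 isn't ready yet; I'll adjust it if the first and second work reasonably well).
Edit: thanks for adding the table definitions and column names.
OK... here are the table definitions with some whitespace padding (which makes the column names easier for me to read):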
TABLE [Reports]
[ReportID] [uniqueidentifier] NOT NULL,
[CustomerID] [int] NOT NULL,
[ReportTypeID] [int] NOT NULL
TABLE [checker].[EndpointResults]
[ReportResultID] [uniqueidentifier] NOT NULL,
[ReportID] [uniqueidentifier] NOT NULL,
[ReportServerID] [nvarchar](50) NOT NULL,
[CheckDate] [datetimeoffset](3) NOT NULL,
[ReportTypeID] [int] NOT NULL,
[Iserrored] [bit] NOT NULL,
[Message] [nvarchar](50) NOT NULL
TABLE [dbo].[ReportAlerts]
[AlertID] [int] IDENTITY(1,1) NOT NULL,
[AlertDate] [datetimeoffset](3) NOT NULL,
[issent] [bit] NOT NULL CONSTRAINT [DF_Alert_issent] DEFAULT ((0)),
[ReportServerID] [nvarchar](50) NULL,
[ReportID] [uniqueidentifier] NULL,
[iserrored] [bit] NOT NULL CONSTRAINT [DF_Alert_iserrored] DEFAULT ((0))
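Do the SQL examples shown below actually run?
If so, do they point you in a useful direction?
Your SQL seems harder than it needs to be.
Let's summarize the basic questions you could ask.
Is the following list of questions complete (or at least complete enough to get you started)?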
Which reports have never run?
Which reports ran ok with zero errors?
Which reports ran ok but have 1+ errors? (which I think
is what you were describing as your "second query
does the opposite of sorts...")
Which reports never ran ok (e.g. only 1+ errors)?
added: Which reports have errors but no Alerts (yet).
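So, let's start by counting how the reports have run. (Note: the SQL is untested and almost certainly contains typos.)
Edit: fixed the reportID counter to use a case statement.
Edit: now using the revised table & column names.
Question: is it important to worry about alerts at this point?
Or is it enough to flag which reports have any errors (ignoring how many alerts there are)?
I don't have enough context to tell whether you want to count EndpointResults or Alerts (or both).
If you need to count reports + alerts, then you'll want to join to ReportAlerts rather than to EndpointResults as shown.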
select a.reportID
, sum( case when b.reportID is null then 0 else 1 end ) as runCnt
, sum( coalesce( b.Iserrored, 0 ) ) as errCnt
from Reports a
left join checker.EndpointResults b
on b.ReportID = a.ReportID
group by a.reportID
order by a.reportID
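Note that putting a coalesce inside the sum() handles any nulls that show up when the left join finds no matching reportid in EndpointResults (i.e. when a report has never attempted to run).
For runCnt (total run attempts), we simply count the non-null reportIDs we see in EndpointResults for the given reportid.
For errCnt (total # of errors), we simply count the non-zero IsErrored values we see in EndpointResults for the given reportid. Since the value here is either 0 or 1, a sum is enough. (This would be more complicated if you had multiple error codes beyond '1 = error'.)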
with ERRCOUNTS ( reportid, runCnt, errCnt ) as (
select a.reportID
, sum( case when b.reportID is null then 0 else 1 end ) as runCnt
, sum( coalesce( b.Iserrored, 0 ) ) as errCnt
from Reports a
left join checker.EndpointResults b
on b.ReportID = a.ReportID
group by a.reportID
), ALERTCOUNTS ( reportid, runCnt, errCnt, alertCnt, alertErrCnt ) as (
select a.reportID
, a.runCnt
, a.errCnt
, sum( case when b.reportID is null then 0 else 1 end ) as alertCnt
, sum( coalesce( b.iserrored, 0 ) ) as alertErrCnt
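from ERRCOUNTS a
left join dbo.ReportAlerts b
on b.ReportID = a.ReportID
-- every non-aggregated column in the select list must be grouped
group by a.reportID, a.runCnt, a.errCnt
)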
select * from ALERTCOUNTS
order by reportid
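For the "errors but no alert yet" question, counting may not be necessary at all. Here is a rough sketch of the rule described in the question (a new alert is needed when no alert was recorded at or after the report's last failure). It assumes SQL Server; the table variables and the report IDs `R001` and `R002` are hypothetical stand-ins for `checker.EndpointResults` and `dbo.ReportAlerts`, and the rows are invented purely for illustration.

```sql
-- Hypothetical stand-ins for checker.EndpointResults and dbo.ReportAlerts.
declare @EndpointResults table (
    ReportID  char(4)           not null,
    CheckDate datetimeoffset(3) not null,
    IsErrored bit               not null
);
declare @ReportAlerts table (
    ReportID  char(4)           null,
    AlertDate datetimeoffset(3) not null,
    IsErrored bit               not null
);

-- Invented sample rows: R001 fails twice and has no alert,
-- R002 fails once and was alerted shortly afterwards.
insert into @EndpointResults (ReportID, CheckDate, IsErrored) values
    ('R001', '2016-02-04 20:18:29.461 -06:00', 1),
    ('R001', '2016-02-04 20:18:34.607 -06:00', 1),
    ('R002', '2016-02-04 20:18:24.000 -06:00', 1),
    ('R002', '2016-02-04 20:18:34.000 -06:00', 0);

insert into @ReportAlerts (ReportID, AlertDate, IsErrored) values
    ('R002', '2016-02-04 20:18:30.000 -06:00', 1);

with LastFailure as (
    -- most recent failed run per report
    select ReportID, max( CheckDate ) as lastFailDate
      from @EndpointResults
     where IsErrored = 1
     group by ReportID
)
select f.ReportID, f.lastFailDate
  from LastFailure f
 where not exists (
        -- an error alert recorded at or after the last failure
        select 1
          from @ReportAlerts al
         where al.ReportID  = f.ReportID
           and al.IsErrored = 1
           and al.AlertDate >= f.lastFailDate );
```

With these rows only R001 should be returned. The sketch deliberately ignores whether the report has since recovered; combining it with whichever "in error" definition you settle on would be the natural next step.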
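If this makes sense (and my assumptions hold), then we can use it to answer the questions above. Consider the following example, which uses a CASE to sort reports into different "status categories" based on our basic runCnt & errCnt query...
Edit: now using the revised table names.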
with REPCOUNTS ( reportID, runCnt, errCnt ) as (
    select a.reportID
         , sum( case when b.reportID is null then 0 else 1 end ) as runCnt
         , sum( coalesce( b.Iserrored, 0 ) ) as errCnt
      from Reports a
      left join checker.EndpointResults b
        on b.ReportID = a.ReportID
    -- let's add a date range because you'll want to focus on
    -- arbitrary time windows eventually instead of everything
    -- over the last 10 years or however long you keep history.
    -- The range sits in the ON clause (not a WHERE) so that reports
    -- with no runs in the window still show up as 'NEVER'.
       and datepart( YEAR, b.CheckDate ) = 2010
       and datepart( MONTH, b.CheckDate ) = 11
     group by a.reportID
)
select
    case when r.runCnt = 0        then 'NEVER'
         when r.errCnt = 0        then 'OK, 0 ERRS'
         when r.errCnt < r.runCnt then 'OK, 1+ ERRS'
         when r.errCnt = r.runCnt then 'TOTAL FAIL'
         else '** Should never happen **'
    end as RUN_STATUS
  , r.reportID
  , r.runCnt
  , r.errCnt
from REPCOUNTS r
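I went back and forth between doing this as "case logic" and as a series of unions, but the case logic seems clearer.
Does this get you closer to where you want to go?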