OK... I don't understand why this query takes so long (MSSQL Server 2005):
[Typical output: 3K rows, 5.5 minutes execution time]
SELECT dbo.Point.PointDriverID, dbo.Point.AssetID, dbo.Point.PointID, dbo.Point.PointTypeID, dbo.Point.PointName, dbo.Point.ForeignID, dbo.Pointtype.TrendInterval, coalesce(dbo.Point.trendpts,5) AS TrendPts, LastTimeStamp = PointDTTM, LastValue=PointValue, Timezone
FROM dbo.Point
LEFT JOIN dbo.PointType ON dbo.PointType.PointTypeID = dbo.Point.PointTypeID
LEFT JOIN dbo.PointData ON dbo.Point.PointID = dbo.PointData.PointID
AND PointDTTM = (SELECT Max(PointDTTM) FROM dbo.PointData WHERE PointData.PointID = Point.PointID)
LEFT JOIN dbo.SiteAsset ON dbo.SiteAsset.AssetID = dbo.Point.AssetID
LEFT JOIN dbo.Site ON dbo.Site.SiteID = dbo.SiteAsset.SiteID
WHERE onlinetrended =1 and WantTrend=1
PointData is the big one, but I thought its definition should let me get what I want easily:
CREATE TABLE [dbo].[PointData](
[PointID] [int] NOT NULL,
[PointDTTM] [datetime] NOT NULL,
[PointValue] [real] NULL,
[DataQuality] [tinyint] NULL,
CONSTRAINT [PK_PointData_1] PRIMARY KEY CLUSTERED
(
[PointID] ASC,
[PointDTTM] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_PointDataDesc] ON [dbo].[PointData]
(
[PointID] ASC,
[PointDTTM] DESC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
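PointData has 550M rows, while Point (the source of PointID) has only 28K rows. I tried creating an indexed view, but I couldn't figure out how to get the last timestamp/value in a way indexed views allow (no MAX, no subqueries, no CTEs).
The query runs twice an hour, and after each run I load more data into PointData for the 3K PointIDs it selected. I thought about adding LastTime / LastValue columns directly to Point, but that seems like the wrong approach.
Am I missing something, or should I rebuild something? (I'm also the DBA, but I know very little about A'ing a DB!)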
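Answer 0 (score: 1)
For starters, try removing the correlated subquery. I also rewrote the query with table aliases to make it easier to read (and to cut down on typing!).
Try something like this: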
SELECT p.PointDriverID, p.AssetID, p.PointID,
p.PointTypeID, p.PointName, p.ForeignID,
pt.TrendInterval, coalesce(p.trendpts,5) AS TrendPts,
LastTimeStamp = PointDTTM, LastValue=PointValue, Timezone
FROM dbo.Point p
LEFT JOIN dbo.PointType pt ON pt.PointTypeID = p.PointTypeID
LEFT JOIN dbo.PointData pd ON p.PointID = pd.PointID
INNER JOIN (
SELECT PointID, Max(PointDTTM) as MaxPointDTTM
FROM dbo.PointData
group by PointID
) pdm on pd.PointID = pdm.PointID and pd.PointDTTM = pdm.MaxPointDTTM
LEFT JOIN dbo.SiteAsset sa ON sa.AssetID = p.AssetID
LEFT JOIN dbo.Site s ON s.SiteID = sa.SiteID
WHERE onlinetrended =1 and WantTrend=1
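One caveat about the rewrite above: because the derived table is INNER JOINed after the LEFT JOIN to PointData, any Point with no PointData rows is dropped, whereas the query in the question would return it with NULLs. If those rows matter, a minimal sketch that keeps the outer-join behavior could look like the following. The unqualified columns (Timezone, onlinetrended, WantTrend) are left unqualified as in the question, since their tables are not stated.

```sql
SELECT p.PointDriverID, p.AssetID, p.PointID,
       p.PointTypeID, p.PointName, p.ForeignID,
       pt.TrendInterval, COALESCE(p.trendpts, 5) AS TrendPts,
       LastTimeStamp = pd.PointDTTM, LastValue = pd.PointValue, Timezone
FROM dbo.Point p
LEFT JOIN dbo.PointType pt ON pt.PointTypeID = p.PointTypeID
-- Latest timestamp per point, computed once as a derived table.
LEFT JOIN (
    SELECT PointID, MAX(PointDTTM) AS MaxPointDTTM
    FROM dbo.PointData
    GROUP BY PointID
) pdm ON pdm.PointID = p.PointID
-- Fetch the matching row; stays NULL when the point has no data.
LEFT JOIN dbo.PointData pd ON pd.PointID = pdm.PointID
                          AND pd.PointDTTM = pdm.MaxPointDTTM
LEFT JOIN dbo.SiteAsset sa ON sa.AssetID = p.AssetID
LEFT JOIN dbo.Site s ON s.SiteID = sa.SiteID
WHERE onlinetrended = 1 AND WantTrend = 1
```

Whether this is faster than the correlated version depends on the plan the optimizer picks, so it is worth comparing both.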
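Answer 1 (score: 0)
I'm not a SQL Server guy, but I do know that querying a table with a WHERE clause that subqueries the same table is bad news, especially with a record set this large. Conceptually, you are scanning that sub-select for every row of data. If I remember correctly, SQL Server lets you store variables in memory; if not, that's fine, you can do the same thing with a table.
Create a server variable (or a table; it only needs one column and one row). Now create a trigger so that whenever a record is inserted or updated in PointData, it checks the variable (or that record). If the datetime of the inserted or updated record is greater than the variable, update the variable. Now you can use the variable in your query, or join to that table. That should take a lot of time off the query.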
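A rough sketch of the trigger idea follows. It departs from the description above in one way, which is an assumption on my part: since the query needs the latest reading per PointID rather than one global maximum, the table keeps one row per point. The names dbo.PointLast and trg_PointData_Last are hypothetical, and the syntax is kept to what SQL Server 2005 supports (no MERGE).

```sql
-- Hypothetical summary table: one row per PointID with its latest reading.
CREATE TABLE dbo.PointLast (
    PointID       int      NOT NULL PRIMARY KEY,
    LastTimeStamp datetime NOT NULL,
    LastValue     real     NULL
);
GO

CREATE TRIGGER dbo.trg_PointData_Last
ON dbo.PointData
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Update points already tracked, using the newest row per PointID
    -- in this statement. (PointID, PointDTTM) is the primary key, so the
    -- join back to "inserted" yields at most one row per point.
    UPDATE pl
    SET    pl.LastTimeStamp = i.PointDTTM,
           pl.LastValue     = i.PointValue
    FROM   dbo.PointLast pl
    JOIN   inserted i ON i.PointID = pl.PointID
    JOIN   (SELECT PointID, MAX(PointDTTM) AS MaxDTTM
            FROM inserted
            GROUP BY PointID) m
           ON m.PointID = i.PointID AND m.MaxDTTM = i.PointDTTM
    WHERE  i.PointDTTM >= pl.LastTimeStamp;

    -- Add points seen for the first time.
    INSERT dbo.PointLast (PointID, LastTimeStamp, LastValue)
    SELECT i.PointID, i.PointDTTM, i.PointValue
    FROM   inserted i
    JOIN   (SELECT PointID, MAX(PointDTTM) AS MaxDTTM
            FROM inserted
            GROUP BY PointID) m
           ON m.PointID = i.PointID AND m.MaxDTTM = i.PointDTTM
    WHERE  NOT EXISTS (SELECT 1 FROM dbo.PointLast pl
                       WHERE pl.PointID = i.PointID);
END
GO
```

The main query could then LEFT JOIN dbo.PointLast on PointID instead of touching PointData at all. This sketch does not handle deletes, or updates that move the latest row to an earlier timestamp, and the table would need a one-time backfill from the existing data.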
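Answer 2 (score: 0)
A nonclustered index on PointData.PointDTTM might make a difference: you are asking SQL to find the MAX value of this field for each PointID, and SQL only has the clustered index to do that with. That is definitely better than a table scan, but still not optimal.
Also, the subquery you join on runs once per row. You can eliminate it with the following modification: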
;WITH PointDataDTTMMax (PointID, PointDTTM)
AS (SELECT PointID, MAX(PointDTTM)
FROM PointData
GROUP BY PointID)
SELECT ...
This uses a CTE (common table expression) and should run the aggregate query only once.
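The answer elides the final SELECT. As an illustration only, here is one way the CTE could be joined back in, trimmed to a few columns. It assumes the filter columns onlinetrended and WantTrend live on dbo.Point (the question leaves them unqualified), and the aliases p, m, and pd are mine.

```sql
;WITH PointDataDTTMMax (PointID, PointDTTM)
AS (SELECT PointID, MAX(PointDTTM)
    FROM dbo.PointData
    GROUP BY PointID)
SELECT p.PointID, p.PointName,
       LastTimeStamp = pd.PointDTTM,
       LastValue     = pd.PointValue
FROM dbo.Point p
-- One row per point from the CTE: its latest timestamp.
LEFT JOIN PointDataDTTMMax m ON m.PointID = p.PointID
-- Look up the value stored at that timestamp.
LEFT JOIN dbo.PointData pd ON pd.PointID = m.PointID
                          AND pd.PointDTTM = m.PointDTTM
-- Assumption: both filter columns belong to dbo.Point.
WHERE p.onlinetrended = 1 AND p.WantTrend = 1;
```

Note that SQL Server does not necessarily materialize a CTE, so the execution plan is the only reliable way to confirm how often the aggregate actually runs.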
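Answer 3 (score: 0)
Include PointValue in the nonclustered index so that it becomes covering (is that index even used in the execution plan?), or change the clustered index so that PointDTTM is DESC.
Also remove the correlated subquery, as mentioned in the other answers (depending on whether the optimizer already handles it well).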
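As a sketch of the covering-index suggestion, the existing IX_PointDataDesc from the question could be rebuilt with PointValue as an included column. INCLUDE is available in SQL Server 2005; the other options are simply carried over from the question's definition, except DROP_EXISTING, which is switched on to replace the index in place.

```sql
-- Rebuild the existing nonclustered index so it also carries PointValue,
-- letting the latest timestamp and value be read from the index alone.
CREATE NONCLUSTERED INDEX [IX_PointDataDesc] ON [dbo].[PointData]
(
    [PointID] ASC,
    [PointDTTM] DESC
)
INCLUDE ([PointValue])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
      IGNORE_DUP_KEY = OFF, DROP_EXISTING = ON, ONLINE = OFF,
      ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
```

On a 550M-row table this rebuild will take a while and needs extra disk space, so it is probably best tried in a maintenance window.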
Answer 4 (score: 0)
I would start by replacing the subquery. I haven't tested this, so hopefully there are no typos:
SELECT dbo.Point.PointDriverID, dbo.Point.AssetID, dbo.Point.PointID, dbo.Point.PointTypeID, dbo.Point.PointName, dbo.Point.ForeignID, dbo.Pointtype.TrendInterval, coalesce(dbo.Point.trendpts,5) AS TrendPts, LastTimeStamp = PointDTTM, LastValue=PointValue, Timezone
FROM dbo.Point
LEFT JOIN dbo.PointType ON dbo.PointType.PointTypeID = dbo.Point.PointTypeID
INNER JOIN (SELECT dbo.PointData.PointID, Max(dbo.PointData.PointDTTM) AS MaxDT
FROM dbo.PointData
INNER JOIN dbo.Point ON dbo.PointData.PointID = dbo.Point.PointID
WHERE onlinetrended =1 and WantTrend=1
GROUP BY dbo.PointData.PointID) f
ON dbo.Point.PointID = f.PointID
INNER JOIN dbo.PointData
ON f.PointID = dbo.PointData.PointID AND f.MaxDT = dbo.PointData.PointDTTM
LEFT JOIN dbo.SiteAsset ON dbo.SiteAsset.AssetID = dbo.Point.AssetID
LEFT JOIN dbo.Site ON dbo.Site.SiteID = dbo.SiteAsset.SiteID
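Then I would check whether you can replace some or all of the left joins with inner joins. Does every Point have a PointType? If so, use an inner join. Does every Point have at least one PointData row? Then use an inner join. Do the same for SiteAsset and Site.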
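A couple of quick checks along these lines could tell you whether a given LEFT JOIN can safely become an INNER JOIN. This is only a sketch using the table and column names from the question; a count of zero means no rows would be lost by switching.

```sql
-- Points whose PointTypeID has no matching PointType row.
SELECT COUNT(*) AS PointsWithoutType
FROM dbo.Point p
LEFT JOIN dbo.PointType pt ON pt.PointTypeID = p.PointTypeID
WHERE pt.PointTypeID IS NULL;

-- Points that have no PointData rows at all.
SELECT COUNT(*) AS PointsWithoutData
FROM dbo.Point p
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.PointData pd
                  WHERE pd.PointID = p.PointID);
```

The same pattern applies to SiteAsset and Site. Keep in mind that a zero today only holds for the current data unless a foreign key or NOT NULL constraint guarantees it.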
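If that is not enough, look at the query's execution plan: which steps take up most of the execution time? Find the big ones and try to optimize them.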
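For example, one way to see where the time goes is to turn on I/O and timing statistics around a suspect piece of the query, here the per-point MAX aggregate, and read the Messages output. This is a minimal sketch, not a full tuning procedure.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

-- The piece under suspicion: latest timestamp per point.
SELECT PointID, MAX(PointDTTM) AS MaxDT
FROM dbo.PointData
GROUP BY PointID;
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

The logical reads reported per table show which access is the expensive one. In Management Studio, "Include Actual Execution Plan" gives the per-operator breakdown for the full query.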