Query and table optimization

Date: 2014-11-17 10:31:24

Tags: sql sql-server

`SELECT DISTINCT ECS.UserId, PFR.EntityId, PFR.CreatedBy
FROM EventConsentStatus ECS
INNER JOIN @Institutions I ON ECS.InstitutionId = I.InstitutionId
LEFT JOIN ParentFormResponses PFR ON PFR.EntityTypeId = 1
          AND PFR.EntityId = @ActivityId AND ECS.EventId = PFR.EntityId
WHERE ECS.EventId = @ActivityId`

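The ParentFormResponses table used in the query above has more than 30k records. The table has no identity column; instead, a clustered primary key index is defined on a group of columns that together are unique. Even so, a simple select statement, i.e. select * from ParentFormResponses, takes more than 16 minutes.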

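If I remove the ParentFormResponses columns from the select list of the query above, it returns results in 2 to 3 seconds, but the query as written takes far too long.

Creating a nonclustered index on EntityId and EntityTypeId did not improve the result either.

Please suggest how I can improve the table structure and the query performance.

Details: table structure: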

`CREATE TABLE [dbo].[ParentFormResponses](
[EntityId] [int] NOT NULL,
[FormId] [int] NOT NULL,
[StudentId] [nvarchar](50) NOT NULL,
[CreatedBy] [nvarchar](50) NOT NULL,
[CreatedOn] [datetime] NOT NULL,
[EntityTypeId] [int] NOT NULL,
[FormVersion] [decimal](18, 1) NOT NULL,
[DigitallySigned] [bit] NULL,
[HasResponseChanged] [bit] NULL,

CONSTRAINT [PK_ParentFormResponses] PRIMARY KEY CLUSTERED (
    [EntityId] ASC,
    [FormId] ASC,
    [StudentId] ASC,
    [EntityTypeId] ASC,
    [FormVersion] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]`

=> I dropped the primary key index and created the following nonclustered index:

`CREATE NONCLUSTERED INDEX [IX_ParentFormResponses_sakshi_EntityId]
ON [dbo].[ParentFormResponses_sakshi] (
    [EntityId] ASC,
    [EntityTypeId] ASC,
    [FormId] ASC,
    [FormVersion] ASC
)
INCLUDE ([StudentId], [CreatedBy])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO`

=> The sp_spaceused results for all three tables are:

  • name: rows, reserved, data, index_size, unused
  • ParentFormResponses: 309961, 64704 KB, 63592 KB, 936 KB, 176 KB
  • ParentFormResponses_sakshi: 309893, 117696 KB, 60472 KB, 56944 KB, 280 KB
  • EventConsentStatus: 673796, 380920 KB, 109240 KB, 271512 KB, 168 KB

Note: the record count is not 30k but more than 3 lakh (300,000). I wrote it incorrectly by mistake.
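
For reference, figures of this kind can be obtained with calls along the following lines. This is only a sketch: the `dbo` schema for EventConsentStatus is an assumption, since the question shows the schema only for the two ParentFormResponses tables.

```sql
-- Report row count plus reserved, data, index and unused space per table.
-- The dbo schema on EventConsentStatus is assumed, not confirmed by the question.
EXEC sp_spaceused N'dbo.ParentFormResponses';
EXEC sp_spaceused N'dbo.ParentFormResponses_sakshi';
EXEC sp_spaceused N'dbo.EventConsentStatus';
```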

    1 answer:

    Answer 0: (score: 1)

    From a performance perspective, the combination of distinct and left join is suspicious. To start, check the following query:

    SELECT ECS.UserId, ECS.EntityId
    FROM EventConsentStatus ECS INNER JOIN
         @Institutions I
         ON ECS.InstitutionId = I.InstitutionId             
    WHERE ECS.EventId = @ActivityId;
    

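    You can optimize this with an index on EventConsentStatus(EventId, UserId). Assuming it performs well, check whether it produces the correct set of rows. If it does not, you are probably getting duplicates from @Institutions. In that case, consider removing the join (no columns from that table are used; it serves only as a filter). Alternatively: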

    SELECT ECS.UserId, ECS.EntityId
    FROM EventConsentStatus ECS 
    WHERE ECS.EventId = @ActivityId AND
          EXISTS (SELECT 1 FROM  @Institutions I WHERE ECS.InstitutionId = I.InstitutionId)
    
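
    The two checks described above might look roughly like this. It is a sketch under assumptions: the declaration of `@Institutions` is not shown in the question, so its single `int` column is a guess, and the index name and `dbo` schema are hypothetical.

    ```sql
    -- Assumed shape of the table variable; the real declaration is not shown.
    DECLARE @Institutions TABLE (InstitutionId int NOT NULL);

    -- Any row returned here is an InstitutionId that occurs more than once
    -- and would therefore multiply the rows coming out of the INNER JOIN.
    SELECT I.InstitutionId, COUNT(*) AS Occurrences
    FROM @Institutions I
    GROUP BY I.InstitutionId
    HAVING COUNT(*) > 1;

    -- Index on the columns named in the answer (hypothetical index name).
    CREATE NONCLUSTERED INDEX IX_EventConsentStatus_EventId_UserId
        ON dbo.EventConsentStatus (EventId, UserId);
    ```

    If the first statement returns no rows, the join cannot introduce duplicates and the DISTINCT is not needed for that reason.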

    To get the correct values, let's use outer apply to fetch the additional data:

    SELECT ECS.UserId, PFR.EntityId, PFR.CreatedBy
    FROM EventConsentStatus ECS INNER JOIN
         @Institutions I
         ON ECS.InstitutionId = I.InstitutionId OUTER APPLY
         (SELECT TOP 1 PFR.EntityId, PFR.CreatedBy
          FROM ParentFormResponses pfr
          WHERE ECS.EventId = PFR.EntityId  AND
                PFR.EntityTypeId = 1 AND
                PFR.EntityId = @ActivityId
         ) pfr
    WHERE ECS.EventId = @ActivityId;
    

    For this, you need an index on ParentFormResponses(EntityId, EntityTypeId, CreatedBy).
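
    One way to create such an index is sketched below. The index name is hypothetical; the table and column names come from the table definition in the question.

    ```sql
    -- Key order follows the answer: the two equality filters first, then the
    -- column the OUTER APPLY returns, so the lookup can be served from the index.
    CREATE NONCLUSTERED INDEX IX_ParentFormResponses_EntityId_EntityTypeId_CreatedBy
        ON dbo.ParentFormResponses (EntityId, EntityTypeId, CreatedBy);
    ```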