Optimizing a CLUSTERED INDEX for use with JOIN

Asked: 2011-08-12 11:10:25

Tags: sql sql-server-2005 tsql indexing clustered-index

Table optin_channel_1 (each "channel" has its own dedicated table)

CREATE TABLE [dbo].[optin_channel_1](
    [key_id] [bigint] NOT NULL,
    [valid_to] [datetime] NOT NULL,
    [valid_from] [datetime] NOT NULL,
    [key_type_id] [int] NOT NULL,
    [optin_flag] [tinyint] NOT NULL,
    [source_proc_id] [int] NOT NULL,
    [date_inserted] [datetime] NOT NULL
) ON [PRIMARY]

CREATE CLUSTERED INDEX [ix_id] ON [dbo].[optin_channel_1] 
(
    [key_type_id] ASC,
    [key_id] ASC,
    [valid_to] ASC,
    [valid_from] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

profile_conns

CREATE TABLE [dbo].[profile_conns](
    [profile_key_id] [bigint] NOT NULL,
    [valid_to] [datetime] NOT NULL,
    [valid_from] [datetime] NOT NULL,
    [conn_key_id] [bigint] NOT NULL,
    [conn_key_type_id] [int] NOT NULL,
    [conn_type_id] [int] NOT NULL,
    [source_proc_id] [int] NOT NULL,
    [date_inserted] [datetime] NOT NULL
) ON [PRIMARY]

CREATE CLUSTERED INDEX [ix_id] ON [dbo].[profile_conns] 
(
    [profile_key_id] ASC,
    [conn_key_type_id] ASC,
    [conn_key_id] ASC,
    [valid_to] ASC,
    [valid_from] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

lu_channel_conns

CREATE TABLE [dbo].[lu_channel_conns](
    [channel_id] [int] NOT NULL,
    [conn_type_id] [int] NOT NULL,
 CONSTRAINT [PK_lu_channel_conns] PRIMARY KEY CLUSTERED 
(
    [channel_id] ASC,
    [conn_type_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

lu_conn_type

CREATE TABLE [dbo].[lu_conn_type](
    [conn_type_id] [int] NOT NULL,
    [default_key_type_id] [int] NOT NULL,
    [master_key_type_id] [int] NOT NULL,
    [date_inserted] [datetime] NOT NULL,
 CONSTRAINT [PK_lu_conns] PRIMARY KEY CLUSTERED 
(
    [conn_type_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

View v_source_proc_id_by_group_id

SELECT DISTINCT x.source_proc_id, x.source_proc_group_id
FROM lu_source_proc x INNER JOIN lu_source_proc_group y ON x.source_proc_group_id = y.group_id

A dynamic SQL statement is then executed:

SET @sql_str='SELECT @ret=MAX(o.optin_flag)
    FROM optin_channel_'+CAST(@channel_id AS NVARCHAR(100))+' o
    INNER HASH JOIN dbo.v_source_proc_id_by_group_id y ON o.source_proc_id=y.source_proc_id AND y.source_proc_group_id=@source_proc_group_id
    INNER HASH JOIN profile_conns z ON z.profile_key_id=cast(@profile_key_id AS NVARCHAR(100)) AND z.conn_key_type_id=o.key_type_id AND z.conn_key_id=o.[key_id] AND z.valid_to=''01.01.3000''
    INNER HASH JOIN lu_channel_conns x ON x.channel_id=@channel_id AND z.conn_type_id=x.conn_type_id
    INNER HASH JOIN lu_conn_type ct ON ct.conn_type_id=x.conn_type_id AND ct.default_key_type_id=o.key_type_id'
SET @param='@channel_id INT, @profile_key_id INT, @source_proc_group_id INT, @ret NVARCHAR(400) OUTPUT'
EXEC sp_executesql @sql_str,@param,@channel_id,@profile_key_id,@source_proc_group_id,@ret OUTPUT

That is, it expands to:

SELECT @ret=MAX(o.optin_flag)
FROM optin_channel_1 o
INNER HASH JOIN dbo.v_source_proc_id_by_group_id y 
    ON o.source_proc_id=y.source_proc_id 
    AND y.source_proc_group_id=5
INNER HASH JOIN profile_conns z 
    ON z.profile_key_id=1 
    AND z.conn_key_type_id=o.key_type_id 
    AND z.conn_key_id=o.[key_id] 
    AND z.valid_to='01.01.3000'
INNER HASH JOIN lu_channel_conns x 
    ON x.channel_id=1 
    AND z.conn_type_id=x.conn_type_id
INNER HASH JOIN lu_conn_type ct 
    ON ct.conn_type_id=x.conn_type_id 
    AND ct.default_key_type_id=o.key_type_id

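These tables are used for an opt-in database. optin_flag can be 0 or 1. I want to get the optin_flag from optin_channel_1 (that is, channel_id=1) for a given user with profile_key_id=1, when the opt-in was inserted into the database by a process that belongs to source_proc_group_id=5. I hope this is enough to understand what is going on.

Is this the best way to use the CLUSTERED INDEX? Or would it be better to remove profile_key_id from the index on profile_conns and put z.profile_key_id=1 in a WHERE clause?

There may be a better way to optimize this SELECT (the database schema cannot be changed; only the indexes and the statement can be modified).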

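1 answer:

Answer 0: (score: 3)

It is hard to judge without knowing the size of the tables and the kind of data stored in them.

Assuming both optin_channel_1 and profile_conns hold a lot of data, I would try the following: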

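  • A clustered index on optin_channel_1(key_id) or on optin_channel_1(key_type_id), depending on which field has the most distinct values (because you do not have a covering index).
  • A clustered index on profile_conns(conn_key_id) or on profile_conns(conn_key_type_id), depending on what you chose for optin_channel_1.
  • And so on for the other tables.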
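
As a rough sketch of the first step, the number of distinct values in each candidate column can be measured before choosing the leading column. The outcome assumed below (key_id being the more selective column) is hypothetical, and keeping key_type_id as a second key column is an assumption that goes slightly beyond the suggestion above.

```sql
-- Count distinct values per candidate leading column.
-- This scans the whole table, so it may be slow on a large one.
SELECT COUNT(DISTINCT key_id)      AS distinct_key_id,
       COUNT(DISTINCT key_type_id) AS distinct_key_type_id
FROM dbo.optin_channel_1;

-- Hypothetical outcome: key_id has far more distinct values.
-- Rebuild the existing clustered index ix_id with key_id leading.
-- key_type_id is kept as the second column (an assumption) because
-- the join to profile_conns uses both columns.
CREATE CLUSTERED INDEX ix_id ON dbo.optin_channel_1 (key_id, key_type_id)
WITH (DROP_EXISTING = ON);
```

DROP_EXISTING = ON replaces the index in one operation instead of a separate DROP INDEX followed by CREATE INDEX.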

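Basically, if your profile_conns table does not hold much data, I would put the clustered index on the most selective "filter" field (I suspect profile_key_id). If the table holds a lot of data, I would aim for a hash/merge join and match its clustered index to the clustered index of the optin_channel_1 table.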
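
For the large-table case, one possible way to line the two clustered indexes up is sketched below. It assumes optin_channel_1 keeps the clustered index shown in the question, which leads with (key_type_id, key_id), and it rebuilds the profile_conns index so that its leading columns are the matching join columns. Whether the optimizer then picks a merge join still depends on the data and statistics.

```sql
-- optin_channel_1 is clustered on (key_type_id, key_id, valid_to, valid_from).
-- Give profile_conns the same leading order on its join columns, so both
-- inputs arrive sorted on the join keys.
CREATE CLUSTERED INDEX ix_id ON dbo.profile_conns (conn_key_type_id, conn_key_id)
WITH (DROP_EXISTING = ON);
```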

I would also rewrite the query like this:

SELECT @ret = MAX(o.optin_flag)
  FROM optin_channel_1 o
  JOIN dbo.v_source_proc_id_by_group_id y 
    ON o.source_proc_id = y.source_proc_id  
  JOIN profile_conns z 
    ON z.conn_key_type_id = o.key_type_id 
   AND z.conn_key_id = o.[key_id] 
  JOIN lu_channel_conns x 
    ON z.conn_type_id = x.conn_type_id
  JOIN lu_conn_type ct 
    ON ct.conn_type_id = x.conn_type_id 
   AND ct.default_key_type_id=o.key_type_id 
 WHERE y.source_proc_group_id = 5
   AND z.profile_key_id = 1 
   AND x.channel_id = 1 
   AND z.valid_to = '01.01.3000'

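The query was changed this way because:

  • Putting the filter conditions in the WHERE clause makes it clear which fields are the relevant ones for the hash/merge join.
  • Supplying join hints is rarely a good idea. It is very hard to beat the query optimizer at determining the best query plan. A bad plan usually indicates a problem with your indexes or statistics.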
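
A minimal sketch of how the hint-free version could be checked: refresh the statistics first, then run the query with I/O and timing output switched on and compare it against the hinted version. The TINYINT type for @ret is an assumption based on the optin_flag column, and FULLSCAN is just one sampling option.

```sql
-- Refresh statistics so the optimizer works from current data distributions.
UPDATE STATISTICS dbo.optin_channel_1 WITH FULLSCAN;
UPDATE STATISTICS dbo.profile_conns WITH FULLSCAN;

-- Report logical reads and CPU/elapsed time for the statements that follow.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @ret TINYINT;  -- assumed type, matching optin_flag

SELECT @ret = MAX(o.optin_flag)
  FROM optin_channel_1 o
  JOIN dbo.v_source_proc_id_by_group_id y
    ON o.source_proc_id = y.source_proc_id
  JOIN profile_conns z
    ON z.conn_key_type_id = o.key_type_id
   AND z.conn_key_id = o.[key_id]
  JOIN lu_channel_conns x
    ON z.conn_type_id = x.conn_type_id
  JOIN lu_conn_type ct
    ON ct.conn_type_id = x.conn_type_id
   AND ct.default_key_type_id = o.key_type_id
 WHERE y.source_proc_group_id = 5
   AND z.profile_key_id = 1
   AND x.channel_id = 1
   AND z.valid_to = '01.01.3000';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```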

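In summary:

  • Small table joined to a big table ==> go for nested loops, and focus your clustered indexes on the "filter" field in the small table and on the join field in the big table.
  • Big table joined to a big table ==> go for a hash/merge join, and put the clustered index on the matching fields on both sides.
  • Multi-field indexes are usually only a good idea when they are "covering", meaning that all the fields you query are contained in the index (either as key columns or in the INCLUDE() clause).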
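
To illustrate the last point, a covering index for the optin_channel_1 side of this query might look like the sketch below. The index name ix_optin_cover is hypothetical, and the column choice assumes the query only touches key_type_id, key_id, source_proc_id and optin_flag from that table, as in the statements above.

```sql
-- Key columns: the join columns used against profile_conns.
-- INCLUDE columns: the remaining columns the query reads, stored only at
-- the leaf level so the query never has to visit the base table.
CREATE NONCLUSTERED INDEX ix_optin_cover   -- hypothetical name
ON dbo.optin_channel_1 (key_type_id, key_id)
INCLUDE (source_proc_id, optin_flag);
```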