Dealing with a huge table: 100M+ rows

Time: 2018-08-24 12:17:38

Tags: sql sql-server sql-server-2016

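I have a table with about 100 million rows, and it is only going to get bigger. Because the table is queried frequently, I need to come up with some way to optimize it.

First, here is the model: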

CREATE TABLE [dbo].[TreningExercises](
[TreningExerciseId] [uniqueidentifier] NOT NULL,
[NumberOfRepsForExercise] [int] NOT NULL,
[CycleNumber] [int] NOT NULL,
[TreningId] [uniqueidentifier] NOT NULL,
[ExerciseId] [int] NOT NULL,
[RoutineExerciseId] [uniqueidentifier] NULL)

And here is the Trenings table:

CREATE TABLE [dbo].[Trenings](
[TreningId] [uniqueidentifier] NOT NULL,
[DateTimeWhenTreningCreated] [datetime] NOT NULL,
[Score] [int] NOT NULL,
[NumberOfFinishedCycles] [int] NOT NULL,
[PercentageOfCompleteness] [int] NOT NULL,
[IsFake] [bit] NOT NULL,
[IsPrivate] [bit] NOT NULL,
[UserId] [nvarchar](128) NOT NULL,
[AllRoutinesId] [bigint] NOT NULL,
[Name] [nvarchar](max) NULL,
)

Indexes (not counting the clustered PKs):

TreningExercises:

  1. TreningId (also an FK)
  2. ExerciseId (also an FK)

Trenings:

  1. UserId (also an FK)
  2. AllRoutinesId (also an FK)
  3. Score
  4. DateTimeWhenTreningCreated (ordered by DateTimeWhenTreningCreated DESC)

Here is an example of the most frequently executed query:

DECLARE @userId VARCHAR(40)
,@exerciseId INT;

SELECT TOP (1) R.[TreningExerciseId] AS [TreningExerciseId]
    ,R.[NumberOfRepsForExercise] AS [NumberOfRepsForExercise]
    ,R.[TreningId] AS [TreningId]
    ,R.[ExerciseId] AS [ExerciseId]
    ,R.[RoutineExerciseId] AS [RoutineExerciseId]
    ,R.[DateTimeWhenTreningCreated] AS [DateTimeWhenTreningCreated]
FROM (
    SELECT TE.[TreningExerciseId] AS [TreningExerciseId]
        ,TE.[NumberOfRepsForExercise] AS [NumberOfRepsForExercise]
        ,TE.[TreningId] AS [TreningId]
        ,TE.[ExerciseId] AS [ExerciseId]
        ,TE.[RoutineExerciseId] AS [RoutineExerciseId]
        ,T.[DateTimeWhenTreningCreated] AS [DateTimeWhenTreningCreated]
    FROM [dbo].[TreningExercises] AS TE
    INNER JOIN [dbo].[Trenings] AS T ON TE.[TreningId] = T.[TreningId]
    WHERE (T.[UserId] = @userId)
        AND (TE.[ExerciseId] = @exerciseId)
    ) AS R
ORDER BY R.[DateTimeWhenTreningCreated] DESC

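Execution plan: link

Please accept my apologies if the query looks a bit ugly: it was generated by an ORM (Entity Framework), and I only edited it slightly.

According to Azure's SQL Analytics tool, this query has the biggest impact on my database. It usually does not take long to execute, but from time to time database I/O spikes because of it.

There is also some business logic that could simplify things: 99% of the time I only need data that is less than a year old. What are my best options, both for the query and for the table size?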

My ideas regarding the query:

  1. Create an indexed view, or
  2. Add the date and UserId columns to the TreningExercises table, or
  3. Some option I haven't thought of :)
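
For reference, option 1 could look roughly like the sketch below. The view name and index name are made up for illustration, and it assumes TreningExerciseId is unique (it is the clustered PK), which is what allows a unique clustered index on the view. Creating an indexed view also requires specific SET options (for example QUOTED_IDENTIFIER and ANSI_NULLS ON), and on editions without automatic indexed-view matching the query has to reference the view with the NOEXPAND hint.

```sql
-- Hypothetical names: vUserExerciseTrenings, IX_vUserExerciseTrenings.
-- SCHEMABINDING and two-part table names are required for an indexed view.
CREATE VIEW [dbo].[vUserExerciseTrenings]
WITH SCHEMABINDING
AS
SELECT TE.[TreningExerciseId]
    ,TE.[NumberOfRepsForExercise]
    ,TE.[TreningId]
    ,TE.[ExerciseId]
    ,TE.[RoutineExerciseId]
    ,T.[UserId]
    ,T.[DateTimeWhenTreningCreated]
FROM [dbo].[TreningExercises] AS TE
INNER JOIN [dbo].[Trenings] AS T ON TE.[TreningId] = T.[TreningId];
GO  -- batch separator (SSMS/sqlcmd); CREATE VIEW must be alone in its batch

-- Materializes the join. The key order matches the query:
-- equality on UserId and ExerciseId, then newest row first.
CREATE UNIQUE CLUSTERED INDEX [IX_vUserExerciseTrenings]
ON [dbo].[vUserExerciseTrenings] (
    [UserId]
    ,[ExerciseId]
    ,[DateTimeWhenTreningCreated] DESC
    ,[TreningExerciseId]
);
```

The trade-off is that the view roughly duplicates the 100M-row table on disk, and every insert or update on either base table also has to maintain it.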

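Regarding the table size, either:

  1. Partition the table (probably by date), or
  2. Move most of the data (or all of it) to some NoSQL key-value store, or
  3. Some option I haven't thought of :)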
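
To make option 1 concrete, partitioning by date would start from something like the sketch below. The object names and boundary dates are placeholders, not values taken from my data. Note that TreningExercises has no date column of its own, so only Trenings could be partitioned this way unless the date is copied over first.

```sql
-- Hypothetical names and boundaries: one partition per calendar year.
-- RANGE RIGHT means each boundary value belongs to the partition on its right.
CREATE PARTITION FUNCTION [pfTreningYear] (datetime)
AS RANGE RIGHT FOR VALUES ('2016-01-01', '2017-01-01', '2018-01-01');

-- Maps every partition to the PRIMARY filegroup
-- (the only filegroup available on Azure SQL Database).
CREATE PARTITION SCHEME [psTreningYear]
AS PARTITION [pfTreningYear] ALL TO ([PRIMARY]);
```

The table's clustered index would then have to be rebuilt on [psTreningYear] ([DateTimeWhenTreningCreated]), and the partitioning column must be part of every unique index key, including the PK, for the index to stay aligned.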

What do you think about these problems, and how should I go about solving them?

1 Answer:

Answer 0: (score: 2)

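If you add the following columns to the index "ix_TreninID":

  • NumberOfRepsForExercise
  • ExerciseId
  • RoutineExerciseId

that will make the index a "covering index" and eliminate the need for the lookup, which accounts for 95% of the plan.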
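
Something along these lines should do it. This is a sketch that assumes "ix_TreninID" is the existing nonclustered index on TreningId as it appears in the plan; adjust the name if yours differs. ONLINE = ON is also an assumption about the edition (it works on Azure SQL Database and Enterprise edition); drop that option otherwise.

```sql
-- Rebuilds the existing index in place, adding the extra columns as
-- INCLUDEd (leaf-level) columns so the key itself stays narrow.
-- TreningExerciseId needs no mention: as the clustered key it is
-- carried in every nonclustered index automatically.
CREATE NONCLUSTERED INDEX [ix_TreninID]
ON [dbo].[TreningExercises] ([TreningId])
INCLUDE (
    [NumberOfRepsForExercise]
    ,[ExerciseId]
    ,[RoutineExerciseId]
)
WITH (DROP_EXISTING = ON, ONLINE = ON);
```

With every column the query reads from TreningExercises present in the index, the plan no longer needs a key lookup into the clustered index for each matching row.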

Give that a go first and post back.