如何使用超过1500万行的表聚合来优化查询?

时间:2018-09-28 09:57:26

标签: sql sql-server tsql query-optimization

此查询的执行时间为3:29分钟。如何减少此查询的执行时间?

FactLoad表的行总数为-16 769 952。

FactLoadFin表的总行数为-12308763。

SELECT    
 wd.id_CompanyOperator AS IdCompanyOperator
, wd.StationFromCode
, wd.StationToCode
, SUM(wd.TariffInvRFPartialSum)/ MAX(wd.CarsCountByOp) AS TariffInvRFWeighted
, SUM(wd.TariffOwnRFPartialSum)/ MAX(wd.CarsCountByOp) AS TariffOwnRFWeighted
, SUM(wd.AmountIncVATPartialSum)/ MAX(wd.CarsCountByOp) AS AmountIncVATWeighted
, MAX(wd.AmountIncVAT) AS MaxAmountIncVAT
, MIN(wd.AmountIncVAT) AS MinAmountIncVAT
FROM      
(
    SELECT 
    AVG(fl.TariffInvRF) OVER (PARTITION BY fl.StationFromCode, 
                                        fl.StationToCode,
                                        fl.id_CompanyOperator,
                                        fl.NaklID)
                * COUNT(fl.ID) OVER (PARTITION BY fl.StationFromCode,
                                        fl.StationToCode,
                                        fl.id_CompanyOperator,
                                        fl.NaklID) AS TariffInvRFPartialSum
    , AVG(fl.TariffOwnRF) OVER (PARTITION BY fl.StationFromCode,
                                        fl.StationToCode,
                                        fl.id_CompanyOperator,
                                        fl.NaklID)
                * COUNT(fl.ID) OVER (PARTITION BY fl.StationFromCode,
                                    fl.StationToCode,
                                    fl.id_CompanyOperator,
                                    fl.NaklID) AS TariffOwnRFPartialSum
    , COUNT(fl.ID) OVER (PARTITION BY fl.StationFromCode,
                                fl.StationToCode,
                                fl.id_CompanyOperator) AS CarsCountByOp
    , AVG(flf.AmountIncVAT) OVER (PARTITION BY fl.StationFromCode,
                                        fl.StationToCode,
                                        fl.id_CompanyOperator,
                                        fl.NaklID) AS AmountIncVAT
    , AVG(flf.AmountIncVAT) OVER (PARTITION BY fl.StationFromCode,
                                        fl.StationToCode,
                                        fl.id_CompanyOperator,
                                        fl.NaklID)
                * COUNT(fl.ID) OVER (PARTITION BY fl.StationFromCode,
                                    fl.StationToCode,
                                    fl.id_CompanyOperator,
                                    fl.NaklID) AS AmountIncVATPartialSum
    , fl.StationFromCode
    , fl.StationToCode
    , fl.id_CompanyOperator
    , fl.NaklID
    FROM   dbo.FactLoad fl LEFT JOIN dbo.FactLoadFin flf 
        ON flf.id_FactLoad = fl.ID
           AND flf.id_Assortment IN (
               SELECT
               ID
               FROM  dbo.SP_Assortment WHERE Name IN ('Car Rent'))
    ) wd
 GROUP BY  wd.StationFromCode
    , wd.StationToCode
    , wd.id_CompanyOperator

The query plan could be seen here.

FactLoad的DDL为:

CREATE TABLE dbo.FactLoad
(
     ID                     BIGINT IDENTITY(1,1)    NOT NULL      
   , NaklID                 BIGINT                  NOT NULL
   , CarNumber              VARCHAR(10)             NOT NULL
   , OutDate                    DATETIME                NOT NULL
   , ArriveDate             DATETIME                NULL
   , DeliveryDate           DATETIME                NULL
   , RegisterDate           DATETIME                NULL      
   , StationFromCode            VARCHAR(6)              NOT NULL
   , StationToCode          VARCHAR(6)              NOT NULL
   , Distance               INT                     NULL
   , DistanceRF             INT                     NULL
   , ETSNG                  VARCHAR(6)              NOT NULL
   , id_RouteType           INT                     NULL
   , id_RouteView           INT                     NULL
   , CarQnt                 INT                     NOT NULL
   , CarCapacity                NUMERIC(7,3)            NULL
   , CarModel               VARCHAR(50)             NULL
   , CarType                    INT                     NOT NULL
   , CargoWeight                NUMERIC(7,3)            NULL
   , OwnerOKPO              VARCHAR(12)             NULL
   , TenantOKPO                 VARCHAR(12)             NULL
   , SpecialNotes           VARCHAR(25)             NULL
   , TariffInv              NUMERIC(18,3)           NULL
   , TariffInvRF                NUMERIC(18,3)           NULL
   , TariffOwn              NUMERIC(18,3)           NULL
   , TariffOwnRF                NUMERIC(18,3)           NULL
   , Operator1              VARCHAR(255)            NULL
   , Operator2              VARCHAR(255)            NULL
   , isDeleted              TINYINT                 NULL DEFAULT(0)
   , DeleteDate             DATETIME2               NULL
   , NeedUpdate_Data            TINYINT                 NOT NULL DEFAULT(0)
   , NeedUpdate_Tariff      TINYINT                 NOT NULL DEFAULT(1)
   , NeedTransffer          TINYINT                 NOT NULL DEFAULT(1)
   , id_FactLoad                BIGINT                  NULL
   , DepartureDate          DATETIME                NULL
   , DepartureDateSuek      DATETIME                NULL
   , id_CompanyOperator     INT                     NOT NULL DEFAULT(-1)
   , StationBorderInCode        VARCHAR(6)              NULL
   , StationBorderInDate        DATE                    NULL
   , TariffVAT              NUMERIC(21,3)           NULL
   , CarSign                    VARCHAR(50)             NULL
   , CONSTRAINT [PK_FactLoad] PRIMARY KEY CLUSTERED (ID) WITH( 
   STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
   ALLOW_PAGE_LOCKS = ON)
)

CREATE NONCLUSTERED INDEX [IX_FactLoad_NaklIDLast] ON dbo.FactLoad(NaklID_Last)

CREATE NONCLUSTERED INDEX [IX_FactLoad_NaklnumberCarnumberOutdate] ON dbo.FactLoad(NaklNumber, CarNumber, OutDate)

CREATE NONCLUSTERED INDEX [IX_FactLoad_ForTariff] ON dbo.FactLoad(OutDate, ETSNG, NeedUpdate_Tariff, CarType, Distance, DistanceRF, StationToCode, StationFromCode, CargoWeight, id_RouteView, CarQnt, CarCapacity, CarModel)

CREATE NONCLUSTERED INDEX [IX_FactLoad_ForTariff_2] ON dbo.FactLoad (NeedUpdate_Tariff, CarType) INCLUDE (Distance, DistanceRF, ETSNG)

ALTER INDEX [IX_FactLoad_ForTariff_2] ON [dbo].[FactLoad] DISABLE



 IF EXISTS(SELECT * FROM sys.indexes WHERE object_id = 
     object_id('dbo.FactLoad') AND NAME ='IX_FactLoad_ForObjectsID4Transfer')
 DROP INDEX [IX_FactLoad_ForObjectsID4Transfer] ON dbo.FactLoad;

 CREATE NONCLUSTERED INDEX [IX_FactLoad_ForObjectsID4Transfer] ON 
      dbo.FactLoad (NaklID, OutDate, NeedUpdate_Tariff, ModifyDate)


 IF EXISTS(SELECT * FROM sys.indexes WHERE object_id = object_id('dbo.FactLoad') AND NAME ='IX_FactLoad_ForObjectsID4Transfer_2')
 DROP INDEX [IX_FactLoad_ForObjectsID4Transfer_2] ON dbo.FactLoad;

 CREATE NONCLUSTERED INDEX [IX_FactLoad_ForObjectsID4Transfer_2] ON 
     dbo.FactLoad (NaklID_Last, OutDate, NeedUpdate_Tariff, ModifyDate)



  CREATE NONCLUSTERED INDEX [IX_FactLoad_ForResetStatus] ON dbo.FactLoad 
      (ModifyDate) INCLUDE (NaklID, NaklID_Last)


  CREATE NONCLUSTERED INDEX [IX_FactLoad_ForIsDeleted] ON dbo.FactLoad 
      (OutDate,isDeleted,SourceFlag) INCLUDE ([ID],[NaklID],[NaklNumber], 
      [CarNumber],[ArriveDate],[DeliveryDate],[StationFromCode], 
      [StationToCode],[DistanceRF],[ETSNG],[id_RouteType],[CarQnt], 
      [CarCapacity],[CarModel],[CargoWeight],[OwnerOKPO],[TenantOKPO], 
      [TariffInvRF],[TariffOwnRF],[SenderOKPO],[RecipOKPO], 
      [PayerOKPO],[StationBorderCode],[DateExpire],[CarWeight],[DueWeight], 
      [Weight1001],[DailyOrderNum],[DailyOrderPos],[DistanceDue], 
      [id_SolidThread],[DepartureDate],[id_CompanySender],[id_CompanyRecip], 
      [id_Incoterms],[isSuek],[id_CompanyPayer],[OwnerID],[TenantID], 
      [id_CompanyOwner],[id_CompanyTenant],[ArriveDateReceiver], 
      [id_CompanyBeneficiary],[id_CompanyOperator],[SAP_CargoCode], 
      [SAP_SenderPE],[id_VAT],[LastNaklNumber],[LastCarNumber],[isDos], 
      [PriorETSNG])


  CREATE NONCLUSTERED INDEX [IX_FactLoad_LastnaklnumberLastcarnumberOutdate] 
  ON dbo.FactLoad (LastNaklNumber, LastCarNumber, OutDate)


  CREATE NONCLUSTERED INDEX [IX_FactLoad_CarnumberNaklnumberOutdate] ON 
     [dbo].[FactLoad] ([CarNumber], [NaklNumber], [OutDate]) ON [PRIMARY]


  CREATE NONCLUSTERED INDEX [IX_FactLoad_StationFromTo] ON [dbo].[FactLoad] 
      ([StationFromCode], [StationToCode])

FactLoadFin的DDL为:

CREATE TABLE dbo.FactLoadFin
   (
    ID                              BIGINT              NOT NULL IDENTITY(1,1)
  , id_FactLoad                     BIGINT              NOT NULL
  , id_Assortment                   INT                 NOT NULL
  , id_AccrualType                  TINYINT             NOT NULL
  , id_Company                      INT                 NOT NULL
  , SpecItem                        INT                 NULL
  , id_VAT                          INT                 NULL
  , Amount                          NUMERIC(18,2)       NULL
  , AmountVAT                       NUMERIC(18,2)       NULL
  , AmountIncVAT                    NUMERIC(18,2)       NULL
  , ActItem                         BIGINT              NULL
  , Scope                           VARCHAR(5)          NULL    
   ) 

 drop index if exists FactLoadFin.IX_FactLoadFin_ActitemIddocumentheader
 CREATE NONCLUSTERED INDEX IX_FactLoadFin_ActitemIddocumentheader ON 
     dbo.FactLoadFin(ActItem, id_DocumentHeader)

 drop index if exists FactLoadFin.IX_FactLoadFin_Iddocumentheader
 CREATE NONCLUSTERED INDEX IX_FactLoadFin_Iddocumentheader ON 
 dbo.FactLoadFin(id_DocumentHeader)

 drop index if exists FactLoadFin.IX_FactLoadFin_Assortment
 create index IX_FactLoadFin_Assortment on FactLoadFin
    (id_Assortment, id_Contract, id_Company, id_FactLoad);
 update statistics FactLoadFin with fullscan;

 CREATE NONCLUSTERED INDEX [IX_FactLoadFin_idContractDoc_SpecItem] ON 
    dbo.FactLoadFin(id_ContractDoc, SpecItem) INCLUDE (id_DocumentHeader, 
    Amount, AmountVAT, AmountIncVAT)

 DROP INDEX IX_FactLoadFin_ScopeAmount ON dbo.FactLoadFin
 CREATE NONCLUSTERED INDEX [IX_FactLoadFin_ScopeAmount] ON 
     dbo.FactLoadFin(Scope, Amount) INCLUDE(id_FactLoad, id_Contract, 
     id_Company, ActItem, AmountVAT)

我减少了CREATE TABLE中的列,以缩短它,因为减少的列不参与要优化的查询。

任何帮助将不胜感激。我认为,提出问题并得到社区的建议要比一直无所事事更好。预先感谢。

1 个答案:

答案 0 :(得分:1)

查看您的计划,似乎大多数资源已被sort操作消耗: enter image description here

enter image description here

我可能会创建以下索引以便对这些数据进行预排序:

CREATE NONCLUSTERED INDEX IX_FactLoad_StationFromCode_StationToCode_idCompanyOperator_NaklID
ON dbo.FactLoad (StationFromCode, StationToCode, id_CompanyOperator, NaklID)
INCLUDE (TariffInvRF, TariffOwnRF, ID);

这应该会改善它,但可能不会创造奇迹。如果可以的话,可以进一步研究并进行进一步研究。

更新

尝试使用ColumnStore索引。

具有ColumnStore索引的测量结果。

  

ColumnStore索引,其中包括实际执行计划:01:02   分钟,不包括实际执行计划在内的时间是:01:01分钟。

ColumnStore索引的代码为

CREATE NONCLUSTERED COLUMNSTORE INDEX 
   [csi_FactLoad_StationFromCode_StationToCode_idCompanyOperator_NaklID] ON 
   [dbo].[FactLoad] ( [NaklID], [StationFromCode], [StationToCode], 
   [id_CompanyOperator], [ID], [TariffInvRF], [TariffOwnRF] )WITH 
   (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0) ON [PRIMARY]