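SQL query between dates and based on employee name

Date: 2017-09-18 22:46:09

Tags: sql-server database tsql azure

I have a SQL Azure database with two main tables that I am trying to join through a view. I have it working, but it takes more than 2 minutes to execute.

Here are the main tables and columns I'm working with: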

TransactionsTable:

PostedDate          | EmployeeFirstName | EmployeeLastName | DollarsCollected | UserName
---------------------------------------------------------------------------------------------
09/08/2017 09:05 am | 'John'            | 'Smith'          |            42.25 | 'john.smith'
09/08/2017 09:07 am | 'Jane'            | 'Jones'          |            58.50 | 'jane.jones'
09/08/2017 09:15 am | 'Tom'             | 'Holland'        |            62.75 | 'tom.holland' 
09/08/2017 09:17 am | 'John'            | 'Smith'          |            48.50 | 'john.smith'    
09/08/2017 09:19 am | 'Jane'            | 'Jones'          |            32.25 | 'jane.jones'

CustomerHistory:

CustomerID | StartDate           | Duration | UserName      | TransactionType
-----------------------------------------------------------------------------
         1 | 09/08/2017 09:02 am |      600 | 'john.smith'  | 'PropertyTax'
         2 | 09/08/2017 09:03 am |      500 | 'tom.holland' | 'TagRenewal'
         3 | 09/08/2017 09:04 am |      450 | 'jane.jones'  | 'PropertyTax'
         4 | 09/08/2017 09:12 am |      700 | 'john.smith'  | 'TagRenewal'
         5 | 09/08/2017 09:16 am |      300 | 'jane.jones'  | 'TagRenewal'

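So here's the deal: an employee can only have one customer at a time. If we know when a transaction was posted and which employee posted it, we should be able to join that information to the CustomerHistory table, using StartDate and StartDate + Duration as the "umbrella" for the overall transaction. Think of StartDate + Duration as the EndDate. Here is the query I am trying to run to accomplish this: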

SELECT
    *
FROM
    CustomerHistory
    JOIN TransactionsTable ON
        CustomerHistory.UserName = TransactionsTable.UserName
        AND
        TransactionsTable.PostedDate >= CustomerHistory.StartDate
        AND
        TransactionsTable.PostedDate <= DATEADD( ss, CustomerHistory.Duration, CustomerHistory.StartDate )

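For reference, I have indexes on the UserName field and on the date fields. I should also say that I am oversimplifying my tables here, since each of the tables I want to join has many more columns of data. I ran the execution plan in SQL Server, and it tells me that a Hash Match accounts for roughly 38% of the execution time and a Table Scan on the transactions table accounts for 42%. I'm decent with SQL, but I have never dug into anything as resource-intensive as what I'm dealing with here, and running it this way puts quite a load on my server.

Can anyone help?

1 answer:

Answer 0 (score: 0)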

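Speeding this up is tricky because your query is non-SARGable. I think adding a persisted computed column should work for your query. You may want to read up on it first, because it can slow down DML (inserts, updates, and deletes).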
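
To illustrate what non-SARGable means here, a minimal sketch against the sample tables below: when the end of the interval only exists as an expression over two columns, no index key holds that value, whereas a plain stored column can be indexed directly. The variable `@PostedDate` is a made-up lookup value, and the second query assumes the persisted `EndDate` column defined later in this answer.

```sql
-- Hypothetical lookup value, used only for illustration.
DECLARE @PostedDate datetime = '2017-09-08T09:05:00';

-- The interval end is computed per row from Duration and StartDate,
-- so no index key holds that value for the optimizer to seek on.
SELECT CustomerID, StartDate, Duration
FROM dbo.CustomerHistory
WHERE UserName = 'john.smith'
  AND StartDate <= @PostedDate
  AND DATEADD(ss, Duration, StartDate) >= @PostedDate;

-- Same filter against a stored column (requires the EndDate column
-- added further down); this form can be supported by an index.
SELECT CustomerID, StartDate, EndDate
FROM dbo.CustomerHistory
WHERE UserName = 'john.smith'
  AND StartDate <= @PostedDate
  AND EndDate >= @PostedDate;
```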

My table creation script, for reference (SQL Fiddle http://sqlfiddle.com/#!6/f1a39/8):

Create Table dbo.TransactionsTable
(
  PostedDate Datetime,
  EmployeeFirstName Varchar(100),
  EmployeeLastName Varchar(100),
  DollarsCollected Money,
  UserName Varchar(100)
);

Create Table dbo.CustomerHistory
(
  CustomerID Int,
  StartDate DateTime,
  Duration Int,
  UserName Varchar(100),
  TransactionType Varchar(100)
);


Insert Into dbo.TransactionsTable
Values 
('09/08/2017 09:05 am','John','Smith',42.25 , 'john.smith'),
('09/08/2017 09:07 am','Jane','Jones',58.50 , 'jane.jones'),
('09/08/2017 09:15 am','Tom','Holland',62.75 , 'tom.holland'), 
('09/08/2017 09:17 am','John','Smith',48.50 , 'john.smith'),    
('09/08/2017 09:19 am','Jane','Jones',32.25 , 'jane.jones');
GO

Insert Into dbo.CustomerHistory
  Values (1,'09/08/2017 09:02 am',600,'john.smith' ,'PropertyTax'),
         (2,'09/08/2017 09:03 am',500,'tom.holland','TagRenewal'),
         (3,'09/08/2017 09:04 am',450,'jane.jones','PropertyTax'),
         (4,'09/08/2017 09:12 am',700,'john.smith','TagRenewal'),
         (5,'09/08/2017 09:16 am',300,'jane.jones','TagRenewal');
Go

Make sure you have this index.

CREATE NONCLUSTERED INDEX ix_test1
  ON TransactionsTable(UserName,PostedDate) 
  Include (EmployeeFirstName,EmployeeLastName,DollarsCollected);
GO

For the customer history table, can you add a persisted computed column? Then index that column.

Alter Table CustomerHistory
Add EndDate As (DateAdd(ss,Duration,StartDate)) Persisted;
GO
CREATE NONCLUSTERED INDEX ix_test2
  ON CustomerHistory(UserName,StartDate,EndDate)
  Include(CustomerID,TransactionType);
GO
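
After adding the column, it may be worth confirming that SQL Server treats it as persisted and indexable. A quick check, assuming the table lives in the dbo schema as in the script above:

```sql
-- List computed columns on the table with their definition and persisted flag.
SELECT name, definition, is_persisted
FROM sys.computed_columns
WHERE object_id = OBJECT_ID('dbo.CustomerHistory');

-- A value of 1 means the expression is deterministic / the column can be indexed.
SELECT
    COLUMNPROPERTY(OBJECT_ID('dbo.CustomerHistory'), 'EndDate', 'IsDeterministic') AS IsDeterministic,
    COLUMNPROPERTY(OBJECT_ID('dbo.CustomerHistory'), 'EndDate', 'IsIndexable') AS IsIndexable;
```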

Here is the query:

SELECT *
FROM CustomerHistory AS A
INNER JOIN TransactionsTable AS B
  ON A.UserName = B.UserName
  AND B.PostedDate Between A.StartDate And A.EndDate;
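
One caveat with `SELECT *`: neither index above includes every column (for example, `Duration` is not in ix_test2), and the real tables have many more columns, so the optimizer may still fall back to lookups or scans. Below is a sketch of the same join with an explicit column list that both indexes cover, wrapped in `SET STATISTICS` so the reads and timings can be compared before and after. Which columns you actually need is an assumption on my part.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Only columns that are keys or included columns of ix_test1 / ix_test2.
SELECT
    A.CustomerID,
    A.TransactionType,
    A.StartDate,
    A.EndDate,
    B.PostedDate,
    B.EmployeeFirstName,
    B.EmployeeLastName,
    B.DollarsCollected,
    B.UserName
FROM dbo.CustomerHistory AS A
INNER JOIN dbo.TransactionsTable AS B
  ON A.UserName = B.UserName
  AND B.PostedDate BETWEEN A.StartDate AND A.EndDate;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Against the five sample rows in each table this returns four rows: the tom.holland transaction at 09:15 falls after his only interval ends (09:03 plus 500 seconds). Note that BETWEEN is inclusive on both ends, so a transaction posted at exactly the moment one customer ends and the next begins (09:12 for john.smith in the sample data) would match both intervals.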