SQL Server - 在非索引字段

时间:2015-06-15 17:37:55

标签: sql-server join query-performance

背景:

我们的一个客户拥有一个系统,他们可以在他们的代理商和他们的客户之间录制电话对话,他们与他们签订了各种合同。记录存储在服务器上,其位置保存在数据库的“记录”表中。然后代理商可以"附加"记录到合同,在ContractRecordings表中创建一个条目。我需要创建一个报告,显示哪些记录没有附加到合同,但表格的设计方式使这比预期的更难。

Recording
------------------------------
recordingId : INT PK, IDENTITY
agentId : INT, FK
filename : NVARCHAR(255)


ContractRecording
----------------------------------
recordingId : INT PK, IDENTITY
contractNumber : INT
created : DATETIME
username : NVARCHAR(20)
note : NVARCHAR(MAX)
fileLocation : NVARCHAR(max)

如果ContractRecording.recordingId是Recording.recordingId的外键引用,那么这很容易,但它不是。它是自己的身份密钥。表之间的唯一链接是文件位置,但Recording.filename仅存储文件名,而ContractRecording.fileLocation存储完整路径。是的,我知道,但我没有设计这些表格。幸运的是,有一个模式,完整的路径来自Agent的名称和录制日期,我们可以从Recording表中的数据中知道这两个。但是当然还有另一个问题:文件路径的格式在大约一年前发生了变化,一些录制内容以旧格式存储,一些录制以新格式存储。

旧格式:C:\ John-Recordings \ 2015 \ 06 \ 15-0811.wav

新格式:C:\ Recordings \ John Smith \ 2015 \ 06 \ 15-0811.wav

问题:

为了链接这两个表,我必须在记录的完整路径上加入它们,这些记录必须在Recording表上手动构建,并且可以采用两种格式之一。我最初尝试在JOIN子句中使用OR,但大约需要8分钟才能返回大约15k行,这是不可接受的。然后我尝试使用两个LEFT OUTER JOIN - 每个条件一个 - 但是用了十分钟就可以得到看似相同的数据。我想这是因为我加入了一个自定义字段,它没有被编入索引。将其拆分为两个SELECT并使用UNION导致重复行,每个查询将为每个记录返回一行。我还有其他选择可以将此查询缩短到几秒钟吗?这是我使用OR子句的原始查询。

SELECT * FROM
    (SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
        ,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathold OR cr.filelocation = rec.fullpathnew) main
ORDER BY main.name, main.recordtime

报告需要为记录表中的所有记录显示一行(除非单个记录附加到多个合同,在这种情况下,每个配对应显示一行),如果有任何行,则显示来自ContractRecording的数据与任一文件位置格式匹配。

如果绝对必要,我不反对只从两个表中提取所有数据并通过代码链接它们,但这是最后的选择。

UPDATE:

根据要求,这里是用于分析的UNION版本的查询。如上所述,它为每个配对返回两行 - 一个包含数据,另一个没有。这是因为两个JOIN中至少有一个总是没有匹配,但我只想在其他JOIN确实匹配时忽略它们。如果JOIN都不匹配,我也只想显示一次。我没有信心,我可以通过UNION实现我想要的结果而不是其他可能性,所以我没有采用这种方法。

SELECT * FROM
    ((SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathold"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathold)
    UNION
    (SELECT 
        cr.recordingid AS "attachedrecordingid"
        ,rec.recordingid AS "rawrecordingid"
        ,cr.contractnumber
        ,cr.created
        ,rec.name
        ,cr.note
        ,cr.filelocation
        ,rec.filename
        ,rec.recordtime
    FROM ContractRecording cr
    RIGHT OUTER JOIN
    (SELECT 
        recordingid
        ,a.name
        ,filename
        ,retain
        ,r.recordtime
        ,'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename AS "fullpathnew"
    FROM Recording r
    JOIN Agents a
        ON r.agentid = a.agentid) rec
    ON cr.filelocation = rec.fullpathnew)) main
ORDER BY main.name, main.recordtime

1 个答案:

答案 0 :(得分:2)

您可以尝试使用LIKE

WITH AgentRecordings AS
(
    SELECT  
        a.name,
        r.recordingId AS rawrecordingid,
        r.filename,
        r.recordtime,
        CONCAT(
            'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + FILENAME,
            'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
        ) AS filepaths
    FROM
        Agents a
        JOIN Recording r ON a.agentId = r.agentId
)
SELECT
    cr.recordingid AS "attachedrecordingid"
    ,rec.recordingid AS "rawrecordingid"
    ,cr.contractnumber
    ,cr.created
    ,rec.name
    ,cr.note
    ,cr.filelocation
    ,rec.filename
    ,rec.recordtime
FROM
    AgentRecordings rec
    LEFT JOIN ContractRecording cr ON rec.filepaths LIKE '%' + cr.filelocation + '%'

如果这有帮助..我也会尝试创建一个临时表而不是使用cte,看看是否有帮助。

您也可以尝试将两个OR语句拆分为2个cte并使用联合来组合找到的记录ID

WITH fullpathnew AS
(
    SELECT  cr.recordingid AS "attachedrecordingid",
            rec.recordingid AS "rawrecordingid",
            cr.contractnumber,
            cr.created,
            cr.note,
            cr.filelocation
    FROM    Agents a
            JOIN Recording r ON a.agentId = r.agentId
            JOIN ContractRecording cr ON cr.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
),
fullpathold AS
(
    SELECT  cr.recordingid AS "attachedrecordingid",
            rec.recordingid AS "rawrecordingid",
            cr.contractnumber,
            cr.created,
            cr.note,
            cr.filelocation
    FROM    Agents a
            JOIN Recording r ON a.agentId = r.agentId
            JOIN ContractRecording cr ON cr.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
)
combinedCtes AS
(
    SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathnew
    UNION SELECT attachedrecordingid, rawrecordingid, contractnumber, created, note, filelocation FROM fullpathold
)
SELECT  cte.attachedrecordingid
        ,r.recordingid AS "rawrecordingid"
        ,cte.contractnumber
        ,cte.created
        ,a.name
        ,cte.note
        ,cte.filelocation
        ,r.filename
        ,r.recordtime
FROM    Agents a
        JOIN Recording r ON r.agentId = a.agentId
        LEFT JOIN combinedCtes cte ON r.recordingid = cte.rawrecordingid

您的UNION需要在子选择中,然后您可以加入该子查询

SELECT  j.attachedrecordingid
        ,r.recordingid AS rawrecordingid
        ,j.contractnumber
        ,j.created
        ,a.NAME
        ,j.note
        ,j.filelocation
        ,r.filename      
        ,r.recordtime
FROM    Agents a
        JOIN Recording r ON a.agentId = r.agentId
        LEFT JOIN(
            SELECT  cr.recordingid AS "attachedrecordingid"
                    ,rec.recordingid AS "rawrecordingid"
                    ,cr.contractnumber
                    ,cr.created
                    ,cr.note
                    ,cr.filelocation
            FROM    Agents a 
                    JOIN Recording r
                    JOIN ContractRecording cr 
                        ON cr1.filelocation = 'C:\' + SUBSTRING(a.name, 0, CHARINDEX(' ', a.name, 0)) + '-Recordings\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename
            UNION
            SELECT  cr.recordingid AS "attachedrecordingid"
                    ,rec.recordingid AS "rawrecordingid"
                    ,cr.contractnumber
                    ,cr.created
                    ,cr.note
                    ,cr.filelocation
            FROM    Agents a 
                    JOIN Recording r
                    JOIN ContractRecording cr 
                        ON cr1.filelocation = 'C:\Recordings\' + a.name + '\' + CONVERT(NVARCHAR(4), DATEPART(yyyy, recordtime)) + '\' + CONVERT(NVARCHAR(2), DATEPART(m, recordtime)) + '\' + filename


        ) j ON r.recordingId = j.rawrecordingid
ORDER BY a.name, r.recordtime