Trying to optimize a query whose join matches on four fields with an OR condition

Time: 2017-02-25 05:07:14

Tags: sql sql-server

I have this query:

SELECT 
      RTPropertyUniqueIdentifier
      ,[FA Unique Listing Identifier - Ref ID]
      ,MLS.[Listing Tracking ID]
      , MLS.[Assessor's Parcel Identification Number] as MLS_PARCELID
      , PP.ParcelID
      , GEOID
      , CASE 
        WHEN PP.PP_ParcelID = MLS.MLS_PARCELID then 'PARCEL MATCH'
        Else 'ADDRESS MATCH' END as MATCHTYPE
      ,[Update Timestamp]
          INTO PROPERTY.DBO.PP_MLS_Bridge
  FROM STAGE.DBO.STAGE_MLS_BRG MLS
  join STAGE.DBO.STAGE_PP_BRG PP on 
    (
        PP.PP_ParcelID = MLS.MLS_PARCELID
    ) OR (
        lower(MLS.StreetNum) = lower(PP.AddNum)
        and (lower(PP.AddStreet) like '%'+lower(MLS.Street_Name)+'%' or lower(MLS.Street_Name) like CONCAT('%', lower(PP.AddStreet), '%'))
        and (lower(PP.AddUnitNum) like '%' + lower(MLS.Unit) + '%' or lower(MLS.Unit) like CONCAT('%', lower(PP.AddUnitNum), '%'))
        and lower(PP.AddCity) = lower(MLS.[Property City])
        and lower(PP.AddState) = lower(MLS.[Property State])
        and lower(PP.AddZip) = lower(MLS.[Property Zip])
);

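The MLS table is 50 million records strong. The PP table is 185 million strong. This query has been running for over 8 days, which suggests I have not optimized it very well. I am looking for a way to speed it up.

Thanks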

Update 1: Execution plan: [image]

Update 2: Updated SQL statement

INSERT PROPERTY.DBO.PP_MLS_Bridge
SELECT 
      RTPropertyUniqueIdentifier
      ,[FA Unique Listing Identifier - Ref ID]
      ,MLS.[Listing Tracking ID]
      , MLS.[Assessor's Parcel Identification Number] as MLS_PARCELID
      , PP.ParcelID
      , GEOID
      , CASE 
        WHEN PP.PP_ParcelID = MLS.MLS_PARCELID then 'PARCEL MATCH'
        Else 'ADDRESS MATCH' END as MATCHTYPE
      ,[Update Timestamp]
  FROM STAGE.DBO.STAGE_MLS_BRG MLS
  join STAGE.DBO.STAGE_PP_BRG PP on PP.PP_ParcelID = MLS.MLS_PARCELID
UNION
SELECT 
      RTPropertyUniqueIdentifier
      ,[FA Unique Listing Identifier - Ref ID]
      ,MLS.[Listing Tracking ID]
      , MLS.[Assessor's Parcel Identification Number] as MLS_PARCELID
      , PP.ParcelID
      , GEOID
      , CASE 
        WHEN PP.PP_ParcelID = MLS.MLS_PARCELID then 'PARCEL MATCH'
        Else 'ADDRESS MATCH' END as MATCHTYPE
      ,[Update Timestamp]
  FROM STAGE.DBO.STAGE_MLS_BRG MLS
  join STAGE.DBO.STAGE_PP_BRG PP on
        MLS.StreetNum = PP.AddNum
        and (PP.AddStreet like '%'+MLS.Street_Name+'%' or MLS.Street_Name like CONCAT('%', PP.AddStreet, '%'))
        and (PP.AddUnitNum like '%' + MLS.Unit + '%' or MLS.Unit like CONCAT('%', PP.AddUnitNum, '%'))
        and PP.AddCity = MLS.[Property City]
        and PP.AddState = MLS.[Property State]
        and PP.AddZip = MLS.[Property Zip];

Execution plan: Part 1 [image]

Part 2 [image]

1 Answer:

Answer 0: (score: 1)

OK, you have 2 terrible table scans. So I can give you some advice:

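  1. Remove LOWER. With the default case-insensitive collation, SQL Server already compares strings ignoring case.
  2. Add indexes on PP.PP_ParcelID and MLS.MLS_PARCELID.
  3. Replace the two alternatives joined by OR in your join with a UNION; the query optimizer can then choose a better plan for each part.
  4. Try to replace all joins on strings, such as lower(PP.AddZip) = lower(MLS.[Property Zip]), with joins on simple integer keys, such as PP.AddZipId = MLS.PropertyZipId. Comparing integers is cheaper.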
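
The index and integer-key suggestions could look roughly like the sketch below. It is untested against the real schema: the index names, the ZipKey lookup table, and the AddZipId / PropertyZipId columns are hypothetical, and the column type and length are guesses, since the question does not show the table definitions.

```sql
-- Sketch only: index names, dbo.ZipKey, AddZipId and PropertyZipId are
-- hypothetical; column types are assumed, not taken from the question.
USE STAGE;
GO

-- Index both sides of the parcel-ID equality join.
CREATE NONCLUSTERED INDEX IX_STAGE_PP_BRG_PP_ParcelID
    ON dbo.STAGE_PP_BRG (PP_ParcelID);

CREATE NONCLUSTERED INDEX IX_STAGE_MLS_BRG_MLS_PARCELID
    ON dbo.STAGE_MLS_BRG (MLS_PARCELID);
GO

-- Map each distinct zip string to an integer surrogate key.
CREATE TABLE dbo.ZipKey (
    ZipId INT IDENTITY(1, 1) PRIMARY KEY,
    Zip   VARCHAR(20) NOT NULL UNIQUE   -- assumed type and length
);
GO

INSERT INTO dbo.ZipKey (Zip)
SELECT AddZip FROM dbo.STAGE_PP_BRG WHERE AddZip IS NOT NULL
UNION                                   -- UNION removes duplicate zips
SELECT [Property Zip] FROM dbo.STAGE_MLS_BRG WHERE [Property Zip] IS NOT NULL;

ALTER TABLE dbo.STAGE_PP_BRG  ADD AddZipId      INT NULL;
ALTER TABLE dbo.STAGE_MLS_BRG ADD PropertyZipId INT NULL;
GO  -- the new columns must exist before the next batch is compiled

-- Populate the integer keys on both staging tables.
UPDATE PP SET AddZipId = Z.ZipId
FROM dbo.STAGE_PP_BRG AS PP
JOIN dbo.ZipKey AS Z ON Z.Zip = PP.AddZip;

UPDATE MLS SET PropertyZipId = Z.ZipId
FROM dbo.STAGE_MLS_BRG AS MLS
JOIN dbo.ZipKey AS Z ON Z.Zip = MLS.[Property Zip];
GO

-- The address branch of the join can then compare integers:
--   ... AND PP.AddZipId = MLS.PropertyZipId
```

GO is a batch separator understood by client tools such as SSMS and sqlcmd, not a T-SQL statement. On tables of this size the two UPDATE statements are expensive in their own right, so it may be worth running them in batches or populating the key columns while the staging tables are loaded.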