Question

I have a query like the one below, which runs on SQL Server 2008:

SELECT
    ipm.HEORG_REFNO,    
    ipm.HOTYP_REFNO,     
    ipm.CASLT_REFNO,     
    ipm.HOLVL_REFNO,    
    IPM.MAIN_IDENT,  
     ...  
FROM  
    dbo.HEALTH_ORGANISATIONS ipm  (NOLOCK)             
LEFT JOIN
    (SELECT
         s.heorg_refno, min(s.start_dttm) as start_dttm_SPONT, max(isnull(convert(datetime,s.end_dttm,120),convert(datetime,'9999-01-01', 120))) as end_dttm_SPONT    
     FROM
         dbo.service_points s (NOLOCK)    
     INNER JOIN
         dbo.reference_values rfval (NOLOCK) ON s.SPTYP_REFNO = rfval.RFVAL_REFNO    
                                             AND RFVAL.MAIN_CODE != 'PDT'         
     GROUP BY 
         s.heorg_refno) SPONT ON ipm.HEORG_REFNO = SPONT.HEORG_REFNO    

 -- Bring only Health Organisation records and also certain records,whose HOTYP_REFNO does not exist in REF_VALS
WHERE
    NOT EXISTS ((SELECT 'x' 
                 FROM REFERENCE_VALUES RVAL (NOLOCK) 
                 WHERE RVAL.RFVAL_REFNO = ipm.HOTYP_REFNO 
                   AND main_code IN ('011','012','015','016',  '017','019','2','AANDE','AEB','AEC','CLINIC','DAYCC','DEPRT','GPSIT','HC','HOSPL','HOST','LOCTN','LOSYN','MIU','MISC','MRL', 'SITE','THEAT','WARD','PDT','NURHM','DAYCR') 
    or ipm.HEORG_REFNO IN(select distinct HEORG_REFNO from SERVICE_POINT_SESSIONS (NOLOCK) where OWNER_HEORG_REFNO = 2001934 and HEORG_REFNO != 2001934) 
    or ipm.HEORG_REFNO IN (select REFNO from LOR_IPM_SYNTH_STG_DEV..  STAGING_Activity_LOCATION_DCS (NOLOCK) where Sources='HEORG_REFNO' and REFNO != 2001934)  
    )
  )

The query takes a long time to execute.

When I comment out the following 2 lines, it runs faster:

or ipm.HEORG_REFNO IN(select distinct HEORG_REFNO from SERVICE_POINT_SESSIONS (NOLOCK) where OWNER_HEORG_REFNO = 2001934 and HEORG_REFNO != 2001934) 
    or ipm.HEORG_REFNO IN (select REFNO from LOR_IPM_SYNTH_STG_DEV..  STAGING_Activity_LOCATION_DCS (NOLOCK) where Sources='HEORG_REFNO' and REFNO != 2001934)

Any guidance on tuning this query would be appreciated.

Answer 1

My first thought is that your query is very complex - I would look for ways to simplify it...

IN clauses don't always perform well - I'd be tempted to load those values into a table variable of forbidden "main_Codes", left join to it and test for null...
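The table-variable idea might look roughly like the sketch below. The name @ExcludedCodes and the VARCHAR(10) type are assumptions for illustration, not part of the original schema. The anti-join only matches the first NOT EXISTS condition if RFVAL_REFNO is unique in REFERENCE_VALUES, and the two `or ... IN (...)` conditions from the question are deliberately left out here.

```sql
-- Hypothetical table variable holding the main_code values to exclude.
-- VARCHAR(10) is an assumed type; match it to REFERENCE_VALUES.MAIN_CODE.
DECLARE @ExcludedCodes TABLE (main_code VARCHAR(10) NOT NULL PRIMARY KEY);

INSERT INTO @ExcludedCodes (main_code)
VALUES ('011'), ('012'), ('015'), ('016'), ('017'), ('019'), ('2'),
       ('AANDE'), ('AEB'), ('AEC'), ('CLINIC'), ('DAYCC'), ('DEPRT'),
       ('GPSIT'), ('HC'), ('HOSPL'), ('HOST'), ('LOCTN'), ('LOSYN'),
       ('MIU'), ('MISC'), ('MRL'), ('SITE'), ('THEAT'), ('WARD'),
       ('PDT'), ('NURHM'), ('DAYCR');

-- Anti-join: keep organisations whose HOTYP_REFNO does not resolve
-- to one of the excluded codes (assumes RFVAL_REFNO is unique).
SELECT ipm.HEORG_REFNO, ipm.HOTYP_REFNO
FROM dbo.HEALTH_ORGANISATIONS ipm
LEFT JOIN dbo.REFERENCE_VALUES rval
       ON rval.RFVAL_REFNO = ipm.HOTYP_REFNO
LEFT JOIN @ExcludedCodes ec
       ON ec.main_code = rval.MAIN_CODE
WHERE ec.main_code IS NULL;
```

Whether this beats the literal IN list depends on the data and the plan, so it is worth comparing both versions before adopting it.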

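It's time to run an execution plan and see where your bottleneck actually is; that depends on your own environment (indexes, statistics, etc.)...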
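One possible way to start measuring, sketched under the assumption that you can run pieces of the query on their own: switch on the I/O and timing statistics, then execute one of the suspect subqueries in isolation. In SQL Server Management Studio you can also enable "Include Actual Execution Plan" before running it.

```sql
-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- One of the two subqueries from the question, run by itself.
SELECT DISTINCT HEORG_REFNO
FROM SERVICE_POINT_SESSIONS
WHERE OWNER_HEORG_REFNO = 2001934
  AND HEORG_REFNO != 2001934;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

A high logical-read count against a small result set usually points to a scan that an index could replace.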

Answer 2

Try converting those IN subqueries to a JOIN query as shown below, and make sure that proper indexes are created on all the columns involved in the join conditions and filter conditions.

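I would modify your query as shown below. Although I can't do anything about your large list right now, you should put that list into a table variable and consider using a JOIN.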

LEFT JOIN SERVICE_POINT_SESSIONS sps ON ipm.HEORG_REFNO = sps.HEORG_REFNO
AND sps.OWNER_HEORG_REFNO = 2001934 
AND sps.HEORG_REFNO != 2001934
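The join fragment above only filters anything once a NULL test is added to the WHERE clause. Below is a minimal sketch of the complete anti-join for the first IN subquery, assuming the intent is to drop health organisations that appear in it. The index name IX_SPS_OWNER_HEORG is hypothetical, and whether such an index already exists is not known from the question.

```sql
-- Hypothetical index supporting the filter on OWNER_HEORG_REFNO
-- and the join on HEORG_REFNO.
CREATE NONCLUSTERED INDEX IX_SPS_OWNER_HEORG
    ON SERVICE_POINT_SESSIONS (OWNER_HEORG_REFNO, HEORG_REFNO);

SELECT ipm.HEORG_REFNO, ipm.HOTYP_REFNO
FROM dbo.HEALTH_ORGANISATIONS ipm
LEFT JOIN SERVICE_POINT_SESSIONS sps
       ON ipm.HEORG_REFNO = sps.HEORG_REFNO
      AND sps.OWNER_HEORG_REFNO = 2001934
      AND sps.HEORG_REFNO != 2001934
WHERE sps.HEORG_REFNO IS NULL;  -- keep only organisations with no match
```

The same pattern would apply to the STAGING_Activity_LOCATION_DCS subquery, with an index on its Sources and REFNO columns.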

SQL query performing poorly
