I have a query like the one below, running on SQL Server 2008:
SELECT
ipm.HEORG_REFNO,
ipm.HOTYP_REFNO,
ipm.CASLT_REFNO,
ipm.HOLVL_REFNO,
IPM.MAIN_IDENT,
...
FROM
dbo.HEALTH_ORGANISATIONS ipm (NOLOCK)
LEFT JOIN
(SELECT
s.heorg_refno, min(s.start_dttm) as start_dttm_SPONT, max(isnull(convert(datetime,s.end_dttm,120),convert(datetime,'9999-01-01', 120))) as end_dttm_SPONT
FROM
dbo.service_points s (NOLOCK)
INNER JOIN
dbo.reference_values rfval (NOLOCK) ON s.SPTYP_REFNO = rfval.RFVAL_REFNO
AND RFVAL.MAIN_CODE != 'PDT'
GROUP BY
s.heorg_refno) SPONT ON ipm.HEORG_REFNO = SPONT.HEORG_REFNO
-- Bring only Health Organisation records and also certain records,whose HOTYP_REFNO does not exist in REF_VALS
WHERE
NOT EXISTS ((SELECT 'x'
FROM REFERENCE_VALUES RVAL (NOLOCK)
WHERE RVAL.RFVAL_REFNO = ipm.HOTYP_REFNO
AND main_code IN ('011','012','015','016', '017','019','2','AANDE','AEB','AEC','CLINIC','DAYCC','DEPRT','GPSIT','HC','HOSPL','HOST','LOCTN','LOSYN','MIU','MISC','MRL', 'SITE','THEAT','WARD','PDT','NURHM','DAYCR')
or ipm.HEORG_REFNO IN(select distinct HEORG_REFNO from SERVICE_POINT_SESSIONS (NOLOCK) where OWNER_HEORG_REFNO = 2001934 and HEORG_REFNO != 2001934)
or ipm.HEORG_REFNO IN (select REFNO from LOR_IPM_SYNTH_STG_DEV.. STAGING_Activity_LOCATION_DCS (NOLOCK) where Sources='HEORG_REFNO' and REFNO != 2001934)
)
)
The query takes a long time to execute.
When I comment out the following two lines, it runs much faster:
or ipm.HEORG_REFNO IN(select distinct HEORG_REFNO from SERVICE_POINT_SESSIONS (NOLOCK) where OWNER_HEORG_REFNO = 2001934 and HEORG_REFNO != 2001934)
or ipm.HEORG_REFNO IN (select REFNO from LOR_IPM_SYNTH_STG_DEV.. STAGING_Activity_LOCATION_DCS (NOLOCK) where Sources='HEORG_REFNO' and REFNO != 2001934)
Thanks for any guidance on tuning the query.
Answer 0 (score: 2)
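My first thought is that your query is very complex. I would look for ways to simplify it.
IN clauses don't always perform well. I would be tempted to pull the forbidden "main_Codes" into a table variable, LEFT JOIN to it, and test for NULL.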
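The table-variable idea could be sketched roughly as follows. This is only an illustration: the name `@ExcludedCodes`, the `VARCHAR(25)` column type, and the `dbo` schema on `REFERENCE_VALUES` are assumptions, the select list is trimmed to two columns, and the `SPONT` join and the two `OR ... IN` branches from the question are left out.

```sql
-- Assumption: MAIN_CODE fits in VARCHAR(25); adjust to the real column type.
DECLARE @ExcludedCodes TABLE (main_code VARCHAR(25) NOT NULL PRIMARY KEY);

-- Same codes as the IN list in the question.
INSERT INTO @ExcludedCodes (main_code)
VALUES ('011'), ('012'), ('015'), ('016'), ('017'), ('019'), ('2'),
       ('AANDE'), ('AEB'), ('AEC'), ('CLINIC'), ('DAYCC'), ('DEPRT'),
       ('GPSIT'), ('HC'), ('HOSPL'), ('HOST'), ('LOCTN'), ('LOSYN'),
       ('MIU'), ('MISC'), ('MRL'), ('SITE'), ('THEAT'), ('WARD'),
       ('PDT'), ('NURHM'), ('DAYCR');

SELECT
    ipm.HEORG_REFNO,
    ipm.HOTYP_REFNO
FROM dbo.HEALTH_ORGANISATIONS ipm
-- Reference values whose code is on the excluded list.
LEFT JOIN (
    SELECT rval.RFVAL_REFNO
    FROM dbo.REFERENCE_VALUES rval
    INNER JOIN @ExcludedCodes ec
        ON ec.main_code = rval.MAIN_CODE
) excluded
    ON excluded.RFVAL_REFNO = ipm.HOTYP_REFNO
-- Keep only organisations with no excluded reference value.
WHERE excluded.RFVAL_REFNO IS NULL;
```

Whether this beats the literal IN list depends on the plan the optimizer picks, so it is worth comparing both versions on real data.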
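Then run an execution plan to see where your bottleneck actually is. That depends on your own environment (indexes, statistics, and so on).
Answer 1 (score: 1)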
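One low-effort way to gather that evidence, alongside the graphical plan in Management Studio, is to switch on the I/O and timing statistics for the session. A minimal sketch, with a placeholder comment standing in for the query under test:

```sql
-- Report logical/physical reads per table and CPU/elapsed time per statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the slow query here, once with and once without the two OR ... IN lines,
-- and compare the figures shown on the Messages tab.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Tables with unexpectedly high logical reads are usually the first place to look for a missing index or a repeated scan.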
Try converting those IN subqueries to JOINs, as shown below, and make sure proper indexes exist on every column used in the join conditions and filter conditions.
I would modify your query to look like the following. I can't do anything about your large IN list right now, but you should put that list into a table variable and consider joining to it.
LEFT JOIN SERVICE_POINT_SESSIONS sps ON ipm.HEORG_REFNO = sps.HEORG_REFNO
AND sps.OWNER_HEORG_REFNO = 2001934
AND sps.HEORG_REFNO != 2001934
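The fragment above shows only the join itself. One possible way to complete it is sketched below, applying the same anti-join pattern to both subqueries and keeping the reference-value test as its own NOT EXISTS. Several things here are assumptions rather than part of the original answer: the `dbo` schema on `SERVICE_POINT_SESSIONS` and `REFERENCE_VALUES`, the index name, the trimmed select list, and the omission of the `SPONT` join. It also assumes `REFERENCE_VALUES` is never empty, because in the question the two OR branches sit inside the EXISTS subquery and only take effect when that table returns at least one row.

```sql
-- Hypothetical supporting index; the name is illustrative. Skip it if an equivalent exists.
CREATE INDEX IX_SPS_OWNER_HEORG
    ON dbo.SERVICE_POINT_SESSIONS (OWNER_HEORG_REFNO, HEORG_REFNO);

SELECT
    ipm.HEORG_REFNO,
    ipm.HOTYP_REFNO
FROM dbo.HEALTH_ORGANISATIONS ipm
-- Anti-join 1: sessions owned by 2001934 but held at a different organisation.
LEFT JOIN dbo.SERVICE_POINT_SESSIONS sps
    ON  sps.HEORG_REFNO = ipm.HEORG_REFNO
    AND sps.OWNER_HEORG_REFNO = 2001934
    AND sps.HEORG_REFNO != 2001934
-- Anti-join 2: staged activity locations keyed by HEORG_REFNO.
LEFT JOIN LOR_IPM_SYNTH_STG_DEV..STAGING_Activity_LOCATION_DCS stg
    ON  stg.REFNO = ipm.HEORG_REFNO
    AND stg.Sources = 'HEORG_REFNO'
    AND stg.REFNO != 2001934
-- A row survives only if neither join found a match.
WHERE sps.HEORG_REFNO IS NULL
  AND stg.REFNO IS NULL
  AND NOT EXISTS (
        SELECT 1
        FROM dbo.REFERENCE_VALUES rval
        WHERE rval.RFVAL_REFNO = ipm.HOTYP_REFNO
          AND rval.MAIN_CODE IN ('011','012','015','016','017','019','2',
                                 'AANDE','AEB','AEC','CLINIC','DAYCC','DEPRT',
                                 'GPSIT','HC','HOSPL','HOST','LOCTN','LOSYN',
                                 'MIU','MISC','MRL','SITE','THEAT','WARD',
                                 'PDT','NURHM','DAYCR')
      );
```

Because both joins are filtered with IS NULL, multiple matching rows in either table cannot duplicate the output. Compare the row counts against the original query before relying on this version.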