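I have the following stored procedure, which assesses every account in a staging table and determines whether it is suitable to import. If it is, the row is flagged with suitableToImport = TRUE. If it is not, a reason is recorded.
Although it is set-based, it is still very slow. I tried moving to EXISTS instead of COUNT, but my tests did not suggest that it makes much of a difference.
Any suggestions on what could be done?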
CREATE OR REPLACE FUNCTION assessInclusionOfAccountsFromStaging () RETURNS BOOLEAN AS $$ /*Only new accounts are valid, detailed issues checking and budge checking duplicates*/
DECLARE
countOfAccountsInStaging INTEGER;
BEGIN
/*Check that we have data to process*/
countOfAccountsInStaging = COUNT(*) FROM importAccountsStaging;
IF(countOfAccountsInStaging) = 0 THEN
RAISE EXCEPTION 'No accounts available';
END IF;
/*SET SuitableToImport*/
RAISE NOTICE 'Processing accounts...';
UPDATE importAccountsStaging SET suitableToImport = TRUE
WHERE
((SELECT COUNT (*) FROM importAccountsStaging as accountsIterated /*Check for duplicates against staging environment at org level*/
WHERE
(accountsIterated.code1 = importAccountsStaging.code1)
OR (accountsIterated.code2 = importAccountsStaging.code2)
)=1)
/*Check for duplicate in masterdb*/
AND ((SELECT COUNT (*) FROM masterAccounts /*Check for any potential duplicate at org level*/
WHERE
(importAccountsStaging.code1 = masterAccounts.code1 )
OR (importAccountsStaging.code2 = masterAccounts.code2 )
)=0)
;
/*SET COMMENT on why it's not suitable to import*/
UPDATE importAccountsStaging SET reason = CONCAT(reason , 'existing account in staging|')
WHERE
NOT ((SELECT COUNT (*) FROM importAccountsStaging as tempAccounts
WHERE
tempAccounts.code1 = importAccountsStaging.code1
OR tempAccounts.code2 = importAccountsStaging.code2
)=1);
/*SET COMMENT on why it's not suitable to import*/
UPDATE importAccountsStaging SET reason = CONCAT(reason , 'existing account in main|')
WHERE
NOT ((SELECT COUNT (*) FROM masterAccounts
WHERE
importAccountsStaging.code1 = masterAccounts.code1
OR importAccountsStaging.code2 = masterAccounts.code2
)=0)
;
/*Return values*/
RAISE NOTICE 'Assessment completed human! ';
RETURN TRUE;
END; $$
LANGUAGE plpgsql;
Many thanks!
Answer 0 (score: 1)
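This is a known antipattern. COUNT(*) can be a very slow operation because it has to scan every matching row. A test based on EXISTS should be much faster because execution stops at the first row found. So never use COUNT to test whether something exists or not! Always use EXISTS.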
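As a rough illustration of the difference, the two queries below return the same staging rows. This is only a sketch: it checks code1 alone for brevity, and the aliases s and m are introduced here for readability and do not appear in the question.

```sql
-- COUNT-based test: every matching row in masterAccounts is counted
-- before the result is compared with 0.
SELECT s.code1
FROM importAccountsStaging AS s
WHERE (SELECT COUNT(*)
       FROM masterAccounts AS m
       WHERE m.code1 = s.code1) > 0;

-- EXISTS-based test: the subquery can stop at the first matching row.
SELECT s.code1
FROM importAccountsStaging AS s
WHERE EXISTS (SELECT 1
              FROM masterAccounts AS m
              WHERE m.code1 = s.code1);
```

How much this helps in practice depends on how many matches each row has and on the available indexes, so it is worth comparing both plans with EXPLAIN ANALYZE.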
Answer 1 (score: 0)
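The OR in the correlated clause is a killer: it can force a full table scan for every record in the table being updated.
Assuming you are only looking for existence rather than an actual count, I would suggest: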
WHERE (EXISTS (SELECT 1
FROM importAccountsStaging ias
WHERE ias.code1 = importAccountsStaging.code1
) OR
EXISTS (SELECT 1
FROM importAccountsStaging ias
WHERE ias.code2 = importAccountsStaging.code2
)
) AND
(NOT EXISTS (SELECT 1
FROM masterAccounts ma
WHERE importAccountsStaging.code1 = ma.code1
) AND
NOT EXISTS (SELECT 1
FROM masterAccounts ma
WHERE importAccountsStaging.code2 = ma.code2
)
)
Then you want indexes on importAccountsStaging(code1), importAccountsStaging(code2), masterAccounts(code1), and masterAccounts(code2).
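A minimal sketch of those four indexes follows. The index names are hypothetical and chosen only for illustration; the table and column names come from the question.

```sql
-- One single-column index per lookup column, so each EXISTS probe
-- can use an index scan instead of scanning the whole table.
CREATE INDEX idx_staging_code1 ON importAccountsStaging (code1);
CREATE INDEX idx_staging_code2 ON importAccountsStaging (code2);
CREATE INDEX idx_master_code1  ON masterAccounts (code1);
CREATE INDEX idx_master_code2  ON masterAccounts (code2);

-- Refresh planner statistics after bulk-loading the staging table.
ANALYZE importAccountsStaging;
ANALYZE masterAccounts;
```

Separate single-column indexes are used here because each subquery filters on exactly one column; a composite index on (code1, code2) would not help lookups on code2 alone.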
If you are looking for a specific count, you can also adapt the logic for that (it should be almost as fast).
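One possible way to handle the "more than one occurrence" case for the staging table is to aggregate once instead of counting per row. This is a sketch of a different technique (GROUP BY with HAVING) rather than the query suggested above, and it covers only code1; the same pattern would need repeating for code2.

```sql
-- Find every code1 value that occurs more than once in staging,
-- then flag all rows carrying one of those values.
UPDATE importAccountsStaging
SET reason = CONCAT(reason, 'existing account in staging|')
WHERE code1 IN (
    SELECT code1
    FROM importAccountsStaging
    GROUP BY code1
    HAVING COUNT(*) > 1
);
```

The duplicate list is computed in a single pass over the table, so the cost no longer grows with one correlated count per updated row.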
Answer 2 (score: 0)
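You may want to consider a complete redesign that gets rid of the OR entirely. I strongly suspect that it will run faster if you break the operation into smaller chunks. For example, instead of the SELECT COUNT(*) on masterAccounts, why not do this: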
UPDATE importAccountsStaging SET reason = CONCAT(reason, 'existing account in main|') FROM masterAccounts WHERE importAccountsStaging.code1 = masterAccounts.code1;
UPDATE importAccountsStaging SET reason = CONCAT(reason, 'existing account in main|') FROM masterAccounts WHERE importAccountsStaging.code2 = masterAccounts.code2;
Do the same for your other checks, and then just finish with:
UPDATE importAccountsStaging SET suitableToImport = TRUE WHERE reason IS NULL;
Answer 3 (score: 0)
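Here is the optimized query. I took input from all the answers, so thank you very much for your help. These changes brought the execution time down from over 10 hours to about 7 minutes.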
CREATE OR REPLACE FUNCTION assessInclusionOfAccountsFromStaging () RETURNS BOOLEAN AS $$ /*Only new accounts are valid, detailed issues checking and budge checking duplicates*/
DECLARE
countOfAccountsInStaging INTEGER;
BEGIN
/*Check that we have data to process*/
countOfAccountsInStaging = COUNT(*) FROM importAccountsStaging;
IF(countOfAccountsInStaging) = 0 THEN
RAISE EXCEPTION 'No accounts available';
END IF;
RAISE NOTICE 'Processing accounts...';
/*Checking value of row against the table the row belongs to for potential duplicates*/
UPDATE importAccountsStaging SET reason = CONCAT(importAccountsStaging.reason , 'existing code1 in staging|')
WHERE ((SELECT COUNT (*) FROM importAccountsStaging as tempAccounts
WHERE
tempAccounts.code1 = importAccountsStaging.code1
)>1);
UPDATE importAccountsStaging SET reason = CONCAT(importAccountsStaging.reason , 'existing code2 in staging|')
WHERE ((SELECT COUNT (*) FROM importAccountsStaging as tempAccounts
WHERE
tempAccounts.code2 = importAccountsStaging.code2
)>1);
/*Checking value of row against another table*/
UPDATE importAccountsStaging SET reason = CONCAT(importAccountsStaging.reason , 'existing code1 in masterDB|')
FROM masterAccounts WHERE importAccountsStaging.code1 = masterAccounts.code1;
UPDATE importAccountsStaging SET reason = CONCAT(importAccountsStaging.reason , 'existing code2 in masterDB|')
FROM masterAccounts WHERE importAccountsStaging.code2 = masterAccounts.code2;
/*Final flag where no issues were found*/
UPDATE importAccountsStaging SET suitableToImport = TRUE
WHERE reason IS NULL;
/*Return values*/
RAISE NOTICE 'Assessment complete, all done! ';
RETURN TRUE;
END; $$
LANGUAGE plpgsql;
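To sanity-check a run, the function can be called and the outcome summarized as sketched below. The summary query and its column alias accounts are illustrative additions, not part of the original post. Note that the final flagging step relies on reason being NULL for rows with no issues, so this assumes reason starts out NULL for every staged row.

```sql
-- Run the assessment; raises an exception if the staging table is empty.
SELECT assessInclusionOfAccountsFromStaging();

-- Summarize how many accounts ended up in each outcome.
SELECT suitableToImport,
       reason,
       COUNT(*) AS accounts
FROM importAccountsStaging
GROUP BY suitableToImport, reason
ORDER BY accounts DESC;
```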