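An uncommon case is a relation that is used as one-to-one, but where the second table can contain millions of rows for a row of the first. For example, I have a 'radcliente' table with millions of related 'radacct' rows, but I need to filter using only the latest acct row. The examples below explain it better:
This is the criteria: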
$criteria = new CDbCriteria();
$criteria->with = [
'acct', // slow: joins millions of rows just to use the last one
];
$criteria->together = true;
$clientes = Cliente::model()->findAll($criteria);
This is the query generated by Yii (very slow, over 40 seconds; it returns millions of rows only for AR to use one of them):
SELECT
`t`.`id` AS `t0_c0`,
-- ...
`t`.`spc_serasa` AS `t0_c56`,
`acct`.`radacctid` AS `t1_c0`,
-- ...
`acct`.`cliente_id` AS `t1_c27`
FROM
`radcliente` `t`
LEFT OUTER JOIN `radacct` `acct` ON (`acct`.`cliente_id`=`t`.`id`)
ORDER BY
radacctid DESC
Applying my solution, which limits the join to a single row (this is fast, 200 ms):
SELECT
`t`.`id` AS `t0_c0`,
-- ...
`t`.`spc_serasa` AS `t0_c56`,
`acct`.`radacctid` AS `t1_c0`,
-- ...
`acct`.`cliente_id` AS `t1_c27`
FROM
`radcliente` `t`
LEFT OUTER JOIN `radacct` `acct` ON (
acct.radacctid = (
SELECT radacctid
FROM `radacct` `acct`
WHERE (acct.cliente_id = t.id)
ORDER BY radacctid DESC
LIMIT 1
)
)
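This is the query that CActiveDataProvider generates to count the total number of items, with my limit-the-join-to-one-row solution applied (slow, 10 seconds for the count):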
SELECT
COUNT(*)
FROM (
SELECT
`t`.`id` AS `t0_c0`,
-- ...
`t`.`spc_serasa` AS `t0_c56`,
`endereco_instalacao`.`id` AS `t1_c0`,
`telefones`.`id` AS `t2_c0`,
`telefones`.`telefone` AS `t2_c3`,
`emails`.`id` AS `t3_c0`,
`emails`.`email` AS `t3_c3`,
`metodo_cobranca`.`id` AS `t4_c0`,
`acct`.`radacctid` AS `t5_c0`,
`acct`.`framedipaddress` AS `t5_c22`
FROM
`radcliente` `t`
LEFT OUTER JOIN `radcliente_endereco_instalacao` `endereco_instalacao` ON (
endereco_instalacao.id = (
SELECT id
FROM `radcliente_endereco_instalacao` `endereco_instalacao`
WHERE (
endereco_instalacao.cliente_id = t.id
)
LIMIT 1
)
)
LEFT OUTER JOIN `radcliente_telefone` `telefones` ON (`telefones`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radcliente_email` `emails` ON (`emails`.`cliente_id`=`t`.`id`)
LEFT OUTER JOIN `radmetodo_cobranca` `metodo_cobranca` ON (
metodo_cobranca.id = (
SELECT id
FROM `radmetodo_cobranca` `metodo_cobranca`
WHERE (metodo_cobranca.cliente_id = t.id)
AND (metodo_cobranca.arquivo = 'nao')
ORDER BY metodo_cobranca.id DESC
LIMIT 1
)
)
LEFT OUTER JOIN `radacct` `acct` ON (
acct.radacctid = (
SELECT radacctid
FROM `radacct` `acct`
WHERE (acct.cliente_id = t.id)
ORDER BY radacctid DESC
LIMIT 1
)
)
GROUP BY t.id
) sq
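The problem is the count generated by CActiveDataProvider, which takes about 10 seconds to return a result. Is there a way to optimize it without losing the relations (I will need to filter by relation in the future)?
Update
Thanks for the replies. I have been running some tests and noticed that it is slow in all cases. The size of the 'radacct' table makes the problem worse, which should not happen given the LIMIT 1 in the subquery. If you want to check for yourself, the models and a link to access the system follow:
Access: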
http://177.86.111.30/dev2/teste
Username: help
Password: 1
Download the models and schema for radcliente and radacct: http://177.86.111.30/files.zip
Answer 0 (score: 0)
Instead of ON id = ( SELECT ... LIMIT 1 )
try adding another JOIN (not a LEFT JOIN):
JOIN ( SELECT ... LIMIT 1 ) x ON ...
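My fear with your code is that it evaluates that subquery over and over, every time the ON clause needs to be checked. My rewrite causes the subquery to be evaluated only once.
Your query looks like it uses a "correlated" subquery, so if possible you should reformulate it as a non-correlated one.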
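To make the idea concrete, here is a minimal sketch of one possible non-correlated form. It is an assumption on my part, not something tested against your schema: a plain LIMIT 1 inside a derived table would return a single row overall, so the sketch swaps in a grouped MAX(radacctid) per cliente_id and joins radacct back on that id. It keeps LEFT OUTER JOIN (rather than the JOIN suggested above) so that clients with no acct rows still appear, as in the question. The tables are reduced to a few columns, the sample rows, the index name and the alias last_acct are made up, and it runs on an in-memory SQLite database only to check that both forms return the same rows. It says nothing about timing on MySQL.

```python
import sqlite3

# Tiny in-memory stand-in for the two tables (columns and rows are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE radcliente (id INTEGER PRIMARY KEY);
    CREATE TABLE radacct (
        radacctid       INTEGER PRIMARY KEY,
        cliente_id      INTEGER,
        framedipaddress TEXT
    );
    CREATE INDEX idx_acct_cliente ON radacct (cliente_id, radacctid);
    INSERT INTO radcliente (id) VALUES (1), (2), (3);  -- client 3 has no acct rows
    INSERT INTO radacct (radacctid, cliente_id, framedipaddress) VALUES
        (10, 1, '10.0.0.1'), (11, 1, '10.0.0.2'),
        (12, 2, '10.0.0.3'), (13, 1, '10.0.0.4');
""")

# Form used in the question: correlated subquery inside the ON clause.
CORRELATED = """
    SELECT t.id, acct.radacctid
    FROM radcliente t
    LEFT OUTER JOIN radacct acct ON acct.radacctid = (
        SELECT a.radacctid FROM radacct a
        WHERE a.cliente_id = t.id
        ORDER BY a.radacctid DESC LIMIT 1
    )
    ORDER BY t.id
"""

# Non-correlated form: the derived table does not reference t, so it can be
# computed once (latest radacctid per client) and then joined.
DERIVED = """
    SELECT t.id, acct.radacctid
    FROM radcliente t
    LEFT OUTER JOIN (
        SELECT cliente_id, MAX(radacctid) AS last_id
        FROM radacct
        GROUP BY cliente_id
    ) last_acct ON last_acct.cliente_id = t.id
    LEFT OUTER JOIN radacct acct ON acct.radacctid = last_acct.last_id
    ORDER BY t.id
"""

rows_correlated = conn.execute(CORRELATED).fetchall()
rows_derived = conn.execute(DERIVED).fetchall()
print(rows_correlated)
print(rows_derived)
```

Whether the derived-table form is actually faster on your data depends on the MySQL version and on having an index covering (cliente_id, radacctid), so compare both with EXPLAIN before adopting either.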