How do I handle/optimize thousands of separately executed SELECT queries?

Time: 2014-10-30 20:02:53

Tags: php mysql sql-server pdo

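I need to synchronize specific information for thousands of rows between two databases (one MySQL, the other a remotely hosted SQL Server database). When I execute this PHP file it gets stuck or times out after a few minutes, I think, so I would like to know how to fix that and perhaps also optimize the way it "syncs".

What the code needs to do:

Basically, for every row (= one account) in my database I want to fetch updated values for two specific pieces of information from the other, SQL Server database (= 2 SELECT queries). So I use a foreach loop that issues 2 SQL queries per row and then writes the results into 2 columns of that row. We are talking about ~10k rows that have to go through this foreach loop.

My idea, which might help?

I have heard about things like PDO transactions, which are supposed to collect all those queries and then send them in one batch of SELECT queries, but I don't know whether I am using them correctly or whether they help at all in this case.

This is my current code, which times out after a few minutes: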

// DBH => MSSQL DB | DB => MySQL DB
$dbh->beginTransaction();
// Get all referral IDs which needs to be updated:
$listAccounts = "SELECT * FROM Gifting WHERE refsCompleted <= 100 ORDER BY idGifting ASC";
$ps_listAccounts = $db->prepare($listAccounts);
$ps_listAccounts->execute();

foreach($ps_listAccounts as $row) {
    $refid=$row['refId'];
    // Refsinserted
    $refsInserted = "SELECT count(username) as done FROM accounts WHERE referral='$refid'";
    $ps_refsInserted = $dbh->prepare($refsInserted);
    $ps_refsInserted->execute();
    $row = $ps_refsInserted->fetch();
    $refsInserted = $row['done'];

    // Refscompleted
    $refsCompleted = "SELECT count(username) as done FROM accounts WHERE referral='$refid' AND finished=1";
    $ps_refsCompleted = $dbh->prepare($refsCompleted);
    $ps_refsCompleted->execute();
    $row2 = $ps_refsCompleted->fetch();
    $refsCompleted = $row2['done'];

    // Update fields for local order db
    $updateGifting = "UPDATE Gifting SET refsInserted = :refsInserted, refsCompleted = :refsCompleted WHERE refId = :refId";
    $ps_updateGifting = $db->prepare($updateGifting);

    $ps_updateGifting->bindParam(':refsInserted', $refsInserted);
    $ps_updateGifting->bindParam(':refsCompleted', $refsCompleted);
    $ps_updateGifting->bindParam(':refId', $refid);
    $ps_updateGifting->execute();
    echo "$refid: $refsInserted Refs inserted / $refsCompleted Refs completed<br>";
}

$dbh->commit();

1 answer:

Answer 0 (score: 2)

You can do all of this in a single query using correlated subqueries:

UPDATE Gifting
SET
    refsInserted=(SELECT COUNT(USERNAME)
                    FROM accounts
                    WHERE referral=Gifting.refId),
    refsCompleted=(SELECT COUNT(USERNAME)
                    FROM accounts
                    WHERE referral=Gifting.refId
                        AND finished=1)

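A correlated subquery is essentially a subquery (a query within a query) that references the parent query. Notice that the WHERE clause of each subquery references the Gifting.refId column. This is not the best option for performance, because each subquery still has to run independently of the other, but it will perform better than what you have now (and is probably as good as you need).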
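To make the correlated-subquery idea concrete, here is a minimal sketch that runs the same UPDATE end to end. It assumes both tables are reachable from a single connection, which the question's setup (MySQL on one side, SQL Server on the other) would only satisfy after copying accounts locally. An in-memory SQLite database and the sample rows are stand-ins I made up for illustration; only the table and column names come from the question.

```php
<?php
// Sketch only: an in-memory SQLite database stands in for a single server
// that holds both tables. Table and column names follow the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE Gifting (refId INTEGER, refsInserted INTEGER, refsCompleted INTEGER)');
$pdo->exec('CREATE TABLE accounts (username TEXT, referral INTEGER, finished INTEGER)');

// Hypothetical sample data.
$pdo->exec('INSERT INTO Gifting (refId, refsInserted, refsCompleted) VALUES (1, 0, 0), (2, 0, 0)');
$pdo->exec("INSERT INTO accounts (username, referral, finished) VALUES
    ('alice', 1, 1), ('bob', 1, 0), ('carol', 2, 1)");

// One statement updates every Gifting row; each subquery is evaluated
// against the refId of the row currently being updated.
$pdo->exec('UPDATE Gifting
    SET refsInserted  = (SELECT COUNT(username) FROM accounts
                         WHERE referral = Gifting.refId),
        refsCompleted = (SELECT COUNT(username) FROM accounts
                         WHERE referral = Gifting.refId AND finished = 1)');

$rows = $pdo->query('SELECT refId, refsInserted, refsCompleted FROM Gifting ORDER BY refId')
            ->fetchAll(PDO::FETCH_ASSOC);
foreach ($rows as $r) {
    echo "{$r['refId']}: {$r['refsInserted']} inserted / {$r['refsCompleted']} completed\n";
}
```

The point of the sketch is that the per-row work moves into the database: PHP sends one statement instead of two SELECTs and one UPDATE per account.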

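Edit:

FYI, I don't think transactions will help with this. They are typically used when you have multiple queries that depend on each other, and they give you a way to roll back if one of them fails. Take a bank transaction, for example. You would not want the balance to be reduced by some amount until the purchase has been inserted, and if the purchase fails for some reason, you want to roll back the change to the balance. So when inserting a purchase, you start a transaction, run the update-balance query and the insert-purchase query, and commit only if both went in correctly and have been validated.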
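The bank example can be sketched with PDO roughly as follows. This is an illustration under assumptions, not code from the question: the balances and purchases tables, the purchase() helper, and the use of in-memory SQLite are all hypothetical. A NULL item is used to force the second statement to fail so the rollback is visible.

```php
<?php
// Sketch only: in-memory SQLite and hypothetical tables (balances, purchases).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)');
$pdo->exec('CREATE TABLE purchases (account TEXT NOT NULL, item TEXT NOT NULL, price INTEGER NOT NULL)');
$pdo->exec("INSERT INTO balances (account, amount) VALUES ('alice', 100)");

// Hypothetical helper: debit the balance and record the purchase atomically.
function purchase(PDO $pdo, string $account, ?string $item, int $price): bool
{
    $pdo->beginTransaction();
    try {
        $debit = $pdo->prepare('UPDATE balances SET amount = amount - :price WHERE account = :account');
        $debit->execute([':price' => $price, ':account' => $account]);

        // A NULL item violates the NOT NULL constraint and throws,
        // which simulates a failed purchase insert.
        $insert = $pdo->prepare('INSERT INTO purchases (account, item, price) VALUES (:account, :item, :price)');
        $insert->execute([':account' => $account, ':item' => $item, ':price' => $price]);

        $pdo->commit();   // both statements succeeded, make them permanent
        return true;
    } catch (PDOException $e) {
        $pdo->rollBack(); // undo the debit as well
        return false;
    }
}

$ok      = purchase($pdo, 'alice', 'book', 30); // both statements succeed
$failed  = purchase($pdo, 'alice', null, 50);   // insert fails, debit is rolled back
$balance = (int) $pdo->query("SELECT amount FROM balances WHERE account = 'alice'")->fetchColumn();
echo "Balance after one good and one failed purchase: $balance\n";
```

Note that nothing here batches or speeds up SELECT queries; the transaction only guarantees that the two writes succeed or fail together.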

EDIT2:

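If I were doing this without an export/import, this is what I would do. It does make a few assumptions, though. The first is that you are using mssql 2008 or newer, and the second is that the referral ID is always a number. I have also used a temp table to insert the numbers into, because you can easily insert multiple rows with a single query and then run a single update query to update the Gifting table. This temp table has the structure CREATE TABLE tempTable (refId int, done int, total int)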

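//get list of referral accounts
//if you are using one column, only query for one column
//(no ORDER BY: the order is irrelevant here, and ordering a DISTINCT result by a
//column that is not selected is rejected by MySQL when ONLY_FULL_GROUP_BY is on)
$listAccounts = "SELECT DISTINCT refId FROM Gifting WHERE refsCompleted <= 100";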
$ps_listAccounts = $db->prepare($listAccounts);
$ps_listAccounts->execute();

//loop over and get list of refIds from above.
$refIds = array();
foreach($ps_listAccounts as $row){
    $refIds[] = $row['refId'];
}


if(count($refIds) > 0){
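    //cast every id to int (the ids are assumed to be numeric) so nothing
    //unexpected is interpolated, then implode for use in the IN (...) lists below
    $refIds = implode(',', array_map('intval', $refIds));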

    //select out total count
    $totalCount = "SELECT referral, COUNT(username) AS cnt FROM accounts WHERE referral IN ($refIds) GROUP BY referral";
    $ps_totalCounts = $dbh->prepare($totalCount);
    $ps_totalCounts->execute();

    //add to array of counts
    $counts = array();

    //loop over total counts
    foreach($ps_totalCounts as $row){
        //if referral id not found, add it
        if(!isset($counts[$row['referral']])){
            $counts[$row['referral']] = array('total'=>0,'done'=>0);
        }
        //add to count
        $counts[$row['referral']]['total'] += $row['cnt'];
    }

    $doneCount = "SELECT referral, COUNT(username) AS cnt FROM accounts WHERE finished=1 AND referral IN ($refIds) GROUP BY referral";
    $ps_doneCounts = $dbh->prepare($doneCount);
    $ps_doneCounts->execute();

    //loop over done counts
    foreach($ps_doneCounts as $row){
        //if referral id not found, add it
        if(!isset($counts[$row['referral']])){
            $counts[$row['referral']] = array('total'=>0,'done'=>0);
        }
        //add to count
        $counts[$row['referral']]['done'] += $row['cnt'];
    }

    //now loop over counts and generate insert queries to a temp table.
    //I suggest using a temp table because you can insert multiple rows
    //in one query and then the update is one query.
    $sqlInsertList = array();
    foreach($counts as $refId=>$count){
        $sqlInsertList[] = "({$refId}, {$count['done']}, {$count['total']})";
    }

    //clear out the temp table first so we are only inserting new rows
    $truncSql = "TRUNCATE TABLE tempTable";
    $ps_trunc = $db->prepare($truncSql);
    $ps_trunc->execute();

    //make insert sql with multiple insert rows
    $insertSql = "INSERT INTO tempTable (refId, done, total) VALUES ".implode(',',$sqlInsertList);
    //prepare sql for the insert on $db (the MySQL connection in the question)
    $ps_insert = $db->prepare($insertSql);
    $ps_insert->execute();

    //sql to update existing rows
    $updateSql = "UPDATE Gifting
                    SET refsInserted=(SELECT total FROM tempTable WHERE refId=Gifting.refId),
                        refsCompleted=(SELECT done FROM tempTable WHERE refId=Gifting.refId)
                    WHERE refId IN (SELECT refId FROM tempTable)
                        AND refsCompleted <= 100";
    $ps_update = $db->prepare($updateSql);
    $ps_update->execute();
} else {
    echo "There were no reference ids found from \$db";
}
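
The multi-row insert into tempTable above interpolates values straight into the SQL string and builds one statement for all rows. As a possible variation, the sketch below builds the same kind of INSERT with placeholders and splits it into chunks, which also avoids issuing an empty VALUES list when no counts were found. The helper name buildTempTableInserts and the chunk size are my own illustrative choices, and the sample $counts data is hypothetical; it is shaped like the $counts array in the code above.

```php
<?php
/**
 * Build parameterized multi-row INSERT statements in chunks.
 * Hypothetical helper: the name and the default chunk size are illustrative.
 *
 * @param array $counts    refId => ['done' => int, 'total' => int]
 * @param int   $chunkSize maximum number of rows per INSERT statement
 * @return array list of ['sql' => string, 'params' => array]
 */
function buildTempTableInserts(array $counts, int $chunkSize = 500): array
{
    $batches = [];
    // preserve_keys = true keeps the refId keys inside each chunk
    foreach (array_chunk($counts, $chunkSize, true) as $chunk) {
        $placeholders = [];
        $params = [];
        foreach ($chunk as $refId => $count) {
            $placeholders[] = '(?, ?, ?)';
            $params[] = (int) $refId;
            $params[] = (int) $count['done'];
            $params[] = (int) $count['total'];
        }
        $batches[] = [
            'sql' => 'INSERT INTO tempTable (refId, done, total) VALUES '
                     . implode(', ', $placeholders),
            'params' => $params,
        ];
    }
    return $batches; // empty array when $counts is empty
}

// Hypothetical counts, shaped like the $counts array in the answer's code.
$counts = [
    7  => ['done' => 1, 'total' => 3],
    9  => ['done' => 0, 'total' => 2],
    12 => ['done' => 4, 'total' => 4],
];
$batches = buildTempTableInserts($counts, 2);

foreach ($batches as $batch) {
    echo $batch['sql'], ' <= ', json_encode($batch['params']), "\n";
    // With a real connection: $db->prepare($batch['sql'])->execute($batch['params']);
}
```

Each batch can be prepared and executed against $db in a loop. A suitable chunk size depends on the server's packet and placeholder limits, so treat 500 as a placeholder value rather than a recommendation.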