How to quickly find elements in a sorted array

Asked: 2014-05-30 19:59:49

Tags: php arrays csv multidimensional-array ldap

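I'm new here, but I have checked previous posts on this topic; they are similar, but not close enough for my purposes. I have a CSV file with 40K+ records and an LDAP result set of 70K+ records; both are stored in multidimensional array variables. The goal is to display all records that do not match. My current solution takes more than 20 minutes to run, which is very inefficient. I created an outer loop that, for each CSV record, checks the LDAP record set (inner loop) for a match; if one is found, it jumps to the next record and unsets the matched LDAP array index to shrink the array for the next pass. I also sorted both arrays in ascending order to speed things up. Any ideas, tweaks, or help to speed up the process?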

foreach($csvArray as $csvindex=>$csvalue) { 
echo "<br />csvArray record: <strong> ".$counter."</strong><br />\n";

  if($counter <= 1) {

      for ($i = 0, $max=$rs["count"]-1; $i < $max ;$i++) { //loop through ldap array
            if($csvalue[0] == $rs[$i]['uid'][0]) { // csv netid & ldap netid
                echo "CSV netid: ".$csvalue[0];
                echo "<br />matched LDAP array [$i] netid: ".$rs[$i]["uid"][0];
                echo "<br />\n";
                $matched = $i; //$i represents integer offset in array (ie. $rs[21])
                break;
            }
      }

    } else {

    unset($rs[$matched]); //remove matched items
    $newRS = array_values($rs); //re-indexes array

    echo "Size of new LDAP array: ".count($newRS);

      for ($i=0, $max=count($newRS); $i<$max; $i++) {
          if($csvalue[0] == $newRS[$i]['uid'][0]) { // csv netid & ldap netid
            echo "<br />CSV netid: ".$csvalue[0];
            echo "<br />matched LDAP array [$i] netid: ".$newRS[$i]["uid"][0];
            echo "<br />\n";
            $matched = $i; //$i represents integer offset in array (ie. $rs[21])
            break;
          }
      }

    } 

$counter++;
}

Example of what the original arrays look like (some information has been changed for security):

 //csvArray 
 Array (
 [0] => Array
    (
        [0] => ABABABAB
        [1] => test.account
        [2] => Chad
        [3] => Moeller
        [4] => chad.moeller@macmillan.com
        [5] => 9/10/2013 9:29 AM
    )

[1] => Array
    (
        [0] => D2L1.Test
        [1] => w40
        [2] => D2L 
        [3] => Test
        [4] => 
        [5] => 10/28/2013 4:24 PM
    )

//ldap multidimensional array
Array (
[count] => 67
[0] => Array
    (
        [uid] => Array
            (
                [count] => 1
                [0] => alackey1
            )

        [0] => uid
        [count] => 1
        [dn] => uid=alackey1,dc=edu
    )

[1] => Array
    (
        [uid] => Array
            (
                [count] => 1
                [0] => blamb3
            )

        [0] => uid
        [count] => 1
        [dn] => uid=blamb3,dc=edu
    )

1 answer:

Answer 0 (score: 0)

Here is some code to replace the inner loop. It uses a binary search, so the LDAP array must already be sorted by uid before this point.

// Binary search over $newRS. The array must be sorted ascending by uid
// using the same ordering as strcmp() (for example via usort with strcmp).
$matched = null; // stays null when the CSV netid is not found in LDAP
$low = 0;
$high = count($newRS) - 1;
while ($low <= $high) {
    $indexToCheck = (int) (($low + $high) / 2); // integer midpoint of the remaining range
    $cmp = strcmp($csvalue[0], $newRS[$indexToCheck]['uid'][0]); // csv netid & ldap netid
    if ($cmp === 0) {
        echo "<br />CSV netid: ".$csvalue[0];
        echo "<br />matched LDAP array [$indexToCheck] netid: ".$newRS[$indexToCheck]["uid"][0];
        echo "<br />\n";
        $matched = $indexToCheck; //$indexToCheck represents integer offset in $newRS
        break;
    } elseif ($cmp < 0) {
        $high = $indexToCheck - 1; // target sorts before the midpoint: search lower half
    } else {
        $low = $indexToCheck + 1; // target sorts after the midpoint: search upper half
    }
}
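
For reference, the same idea can be written as a self-contained script. This is only a minimal sketch, not a drop-in replacement: the helper names `findUid` and `unmatchedCsvRecords` are hypothetical, the sample data is made up to mirror the array shapes in the question, and it assumes netids should be compared case-sensitively with `strcmp` (swap in `strcasecmp` in both places if that assumption does not hold for your directory).

```php
<?php
// Hypothetical helper: binary search for $netid among LDAP entries.
// Assumes $entries is a 0-indexed list sorted by ['uid'][0] in strcmp() order.
// Returns the matching offset, or -1 when the netid is absent.
function findUid(array $entries, $netid)
{
    $low = 0;
    $high = count($entries) - 1;
    while ($low <= $high) {
        $mid = (int) (($low + $high) / 2);
        $cmp = strcmp($netid, $entries[$mid]['uid'][0]);
        if ($cmp === 0) {
            return $mid;
        }
        if ($cmp < 0) {
            $high = $mid - 1; // target sorts before the midpoint
        } else {
            $low = $mid + 1;  // target sorts after the midpoint
        }
    }
    return -1;
}

// Hypothetical helper: collect CSV rows whose netid (column 0) has no LDAP entry.
function unmatchedCsvRecords(array $csvArray, array $rs)
{
    // The LDAP result array carries a "count" key next to the numeric entries;
    // drop it and re-index so only the entries remain.
    unset($rs['count']);
    $entries = array_values($rs);

    // Sort once, with the same comparison the search uses.
    usort($entries, function ($a, $b) {
        return strcmp($a['uid'][0], $b['uid'][0]);
    });

    $unmatched = array();
    foreach ($csvArray as $row) {
        if (findUid($entries, $row[0]) === -1) {
            $unmatched[] = $row;
        }
    }
    return $unmatched;
}

// Made-up sample data in the same shape as the arrays in the question.
$csvArray = array(
    array('alackey1', 'w10'),
    array('D2L1.Test', 'w40'),
);
$rs = array(
    'count' => 2,
    0 => array('uid' => array('count' => 1, 0 => 'blamb3')),
    1 => array('uid' => array('count' => 1, 0 => 'alackey1')),
);

foreach (unmatchedCsvRecords($csvArray, $rs) as $row) {
    echo "No LDAP match for CSV netid: " . $row[0] . "\n";
}
```

Because the search only moves two integer bounds, nothing is copied or re-indexed per lookup, and each CSV record needs at most about 17 comparisons against a 70K-entry LDAP list. There is also no need to unset matched entries between iterations.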