我有两个巨大的阵列:
所以基本上是我的阵列:
A = array (
[0] => array(
"name" => "KE-KE IMPEX ",
"email" => "someemai@gmail.com",
"kezbCompany" => "Fragrance",
"startDate" => "2013-03-25 00:00:00",
"endDate" => "2014-03-25 00:00:00",
"companyBase" => "06 20 232 2534"
)
...
[4900] => array(
"name" => "Jane Doe",
"email" => "zzer@sad.com",
"kezbCompany" => "sadsad",
"startDate" => "2013-03-25 00:00:00",
"endDate" => "2014-03-25 00:00:00",
"companyBase" => "06 20 232 2534"
)
)
B = array (
[0] => array(
"name" => "KE-KE IMPEX 46554 sda",
"email" => "xxx@gmail.com",
"kezbCompany" => "546wer",
"startDate" => "2013-03-25 00:00:00",
"endDate" => "2014-03-25 00:00:00",
"companyBase" => "06 20 232 2534"
)
...
[700] => array(
"name" => "45 Jane Doe",
"email" => "kekeimpex@gmail.com",
"kezbCompany" => "asd",
"startDate" => "2013-03-25 00:00:00"
)
)
小项目看起来像这样(例如A和B中的展位):
array(
'name' => 'John Doe',
'email' => 'john@doe.com'
)
所以我需要做的是:检查哪个小数组具有相同的名称。
但请记住,大多数时候两个小阵列的结构不一样。
例如,他们的电子邮件可能不同。现在,如果我首先循环通过A,我循环通过B,这需要花费很多时间。
这是我目前的代码:
$szData = file_get_contents('szData.txt');
$kData = file_get_contents('kData.txt');
$A = json_decode($szData);
$B = json_decode($kData);
$foundNr = 0;
foreach ($A as $key => $sz)
{
$cName = $sz->companyName;
foreach ($B as $index => $k)
{
$pattern = '/^(.*)+('.$cName.')/i';
echo "SzSor: " . $key . " --- Ksor: " . $index . "</br>";
if (preg_match($pattern, $k->companyName))
{
$founData[] = $k->companyName;
++$foundNr;
}
}
}
有什么想法吗?
答案 0 :(得分:0)
一种解决方案是首先以压缩字符串(John Doe-john@doe.com
)合并子数组中的所有数据,然后使用快速函数来比较匹配的字符串:
$szData = file_get_contents('szData.txt');
$kData = file_get_contents('kData.txt');
$A = json_decode($szData);
$B = json_decode($kData);
$foundNr = 0;
$B_sample = array();
foreach ($B as $index => $data){
$B_sample[$index] = implode('-',$data);
//or use some other custom procedure to create unique string from the data.
}
$B_sample = array();
foreach ($A as $index => $data){
$A_sample[$index] = implode('-',$data);
}
$A_B_intersect = array_intersect ($A_sample , $B_sample ); // the KEYS in the result array are from $A
$foundNr = count($A_B_intersect);
foreach($A_B_intersect as $key->$data){
$founData[] = $A[$key]->companyName;
}
如果您可以在构建$A_sample
时创建$A
,并重复使用相同的循环,则会有额外的好处。
答案 1 :(得分:-1)
您可能最好避免在此处使用自己的循环等,并尝试使用内置的PHP函数,如:
从文档中获取的代码:
<?php
$os = array("Mac", "NT", "Irix", "Linux");
if (in_array("Irix", $os)) {
echo "Got Irix";
}
if (in_array("mac", $os)) {
echo "Got mac";
}
?>
The second condition fails because in_array() is case-sensitive, so the program above will display:
Got Irix
Example #2 in_array() with strict example
<?php
$a = array('1.10', 12.4, 1.13);
if (in_array('12.4', $a, true)) {
echo "'12.4' found with strict check\n";
}
if (in_array(1.13, $a, true)) {
echo "1.13 found with strict check\n";
}
?>
The above example will output:
1.13 found with strict check
因此,虽然您可能仍需要遍历一个数组来比较所有可能的值,但最有效的代码通常来自使用内置函数。
说了这么多,写下几个简单的循环并测试它,看看性能最好。如果您知道数据的结构,那么您可以完全跳过许多元素 - 例如,如果您知道数据将按字母顺序排列,并且您正在寻找以&#34; S& #34;,做一个一次跳过100条记录的循环,直到你到达&#34; S&#34; value - 然后返回到它之前的最后一个条目。像这样的东西。这种思维方式可以实现快速有效的代码 - 如果你可以跳过5000个中第一个3500个元素中的99个,则不会逐个元素地执行搜索。