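I have a large XML file (~350 MB) that I need to load into a MySQL table, with each field stored in its own column.
I tried LOAD XML, but it never succeeded: I kept running out of memory and nothing was imported. (I'm on hosting where I can't control php.ini, and every attempt to raise the memory limit, whether through my own php.ini or in the script itself with ini_set('memory_limit', -1) etc., had no effect.)
So I'm now trying to parse the XML with XMLReader. My XML contains about 130,000 entries, each with 18 children.
My code works up to a point: I can insert about 75,000 rows, and then I either run out of memory or my connection "goes away".
My question is: how can I make this code more memory efficient?
Code: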
<?php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
print("<br>starting<br>");
// connect
$mysqli = mysqli_connect(CONNECTION STUFF);
// check connection
if (mysqli_connect_error()) {
    printf("Connect failed: %s\n", mysqli_connect_error());
    exit();
}
// open xml
$xml = new XMLReader;
$xml->open('../xml/jobs.xml');
$sql = $orig_sql = "REPLACE INTO jobs (jobref, date, title, company, email, url, salarymin, salarymax, benefits, salary, jobtype, full_part, salary_per, location, country, description, category, image) VALUES ";
$count = 0;
$chunks = 500;
$total = 0;
// hand-rolled escaping of characters that are special inside a quoted SQL string
function escape_sql($unescaped) {
    $replacements = array(
        "\x00"=>'\x00',
        "\n"=>'\n',
        "\r"=>'\r',
        "\\"=>'\\\\',
        "'"=>"\'",
        '"'=>'\"',
        "\x1a"=>'\x1a'
    );
    return strtr($unescaped,$replacements);
}
// move to the first <job /> node
while ($xml->read() && $xml->name !== 'job');
// now that we're at the right depth, hop to the next <job /> until the end of the tree
while ($xml->name === 'job'){
    // parse only the current <job> element into a SimpleXMLElement
    $node = simplexml_load_string($xml->readOuterXML());
    $title = escape_sql($node->{'title'});
    $desc = escape_sql($node->{'description'});
    $email = escape_sql($node->{'email'});
    $url = escape_sql($node->{'url'});
    $img = escape_sql($node->{'image'});
    $sal = escape_sql($node->{'salary'});
    $bens = escape_sql($node->{'benefits'});
    $comp = escape_sql($node->{'company'});
    $cat = escape_sql($node->{'category'});
    $loc = escape_sql($node->{'location'});
    // add to the sql query
    $sql .= "(" . $node->{'jobref'}
        . ",'" . $node->{'date'}
        . "','" . $title
        . "','" . $comp
        . "','" . $email
        . "','" . $url
        . "','" . $node->{'salarymin'}
        . "','" . $node->{'salarymax'}
        . "','" . $bens
        . "','" . $sal
        . "','" . $node->{'jobtype'}
        . "','" . $node->{'full_part'}
        . "','" . $node->{'salary_per'}
        . "','" . $loc
        . "','" . $node->{'country'}
        . "','" . $desc
        . "','" . $cat
        . "','" . $img
        . "')";
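    $count++;
    $total++;
    // once a full chunk is collected, send it as one multi-row REPLACE
    if ($count === $chunks) {
        mysqli_query($mysqli, $sql) or die(mysqli_error($mysqli));
        $count = 0;
        print('<br>inserted : '.$total.'<br>');
        $sql = $orig_sql;
    } else {
        $sql .= ',';
    }
    $xml->next('job');
}
// insert the rows left over from the last partial chunk
if ($count > 0) {
    mysqli_query($mysqli, rtrim($sql, ',')) or die(mysqli_error($mysqli));
    print('<br>inserted : '.$total.'<br>');
}
$xml->close();
mysqli_close($mysqli);
print 'finished inserting data';
exit();
?>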
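
One direction I am considering, as a rough sketch only: expand each `<job>` node straight into a DOM node with `XMLReader::expand()` instead of serializing it with `readOuterXML()` and re-parsing the string, copy the children into plain strings, and hand rows to a callback in fixed-size batches, including the final partial batch. The names `stream_jobs` and `$flush` and the sample XML are hypothetical and not part of my script. I have not confirmed that this fixes the memory problem on my host.

```php
<?php
// Sketch only: stream <job> elements and pass them to a callback in batches.
// stream_jobs() and $flush are hypothetical names used for illustration.
function stream_jobs(string $path, int $batchSize, callable $flush): int
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    $doc   = new DOMDocument();   // owner document for each expanded node
    $batch = [];
    $total = 0;

    // Advance to the first <job> element.
    while ($reader->read() && $reader->name !== 'job');

    while ($reader->name === 'job') {
        // expand() copies only the current element into a DOM node,
        // avoiding the string round trip through readOuterXML().
        $job = simplexml_import_dom($doc->importNode($reader->expand(), true));

        $row = [];
        foreach ($job->children() as $child) {
            $row[$child->getName()] = (string) $child;   // keep plain strings only
        }
        $batch[] = $row;
        $total++;

        if (count($batch) === $batchSize) {
            $flush($batch);
            $batch = [];
        }
        unset($job);              // release the node before moving on
        $reader->next('job');
    }

    if ($batch) {
        $flush($batch);           // flush the final partial batch
    }
    $reader->close();
    return $total;
}

// Usage with a tiny made-up sample written to a temp file.
$tmp = tempnam(sys_get_temp_dir(), 'jobs');
file_put_contents($tmp, '<jobs><job><jobref>1</jobref><title>A</title></job>'
    . '<job><jobref>2</jobref><title>B</title></job>'
    . '<job><jobref>3</jobref><title>C</title></job></jobs>');

$batches = [];
$total = stream_jobs($tmp, 2, function (array $rows) use (&$batches) {
    $batches[] = $rows;   // a real script would run one REPLACE per batch here
});
unlink($tmp);
```

In the real script the callback would build and run the multi-row REPLACE for one batch, ideally escaping values with `mysqli_real_escape_string()` or a prepared statement rather than a hand-written replacement table. A smaller batch size might also be worth trying, in case the "goes away" error is related to the size of each query.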