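I am trying to save data from a website into a MySQL database. I can save most of what I want, but I have one particular problem: the links I extract are being saved, but I want each link to be in the same row as the other attributes. Below are my cURL and MySQL query used to extract the information and save it to the database.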
$target_url = "http://www.ucc.ie/modules/descriptions/BM.html";
$codeS = "BM";
$html = file_get_contents("http://www.ucc.ie/modules/descriptions/BM.html");
@$doc = new DomDocument();
@$doc->loadHtml($html);
//discard white space
@$doc->preserveWhiteSpace = false;
$xpath = new DomXPath($doc);
//Read through dd tags
$options = $doc->getElementsByTagName('dd');
//Go into dd tags and look for all the links with class modnav
$links = $xpath->query('//dd //a[@class = "modnav"]');
//Loop through and display the results for links
foreach($links as $link){
echo $link->getAttribute('href'), '<br><br>';
}
foreach ($options as $option) {
echo "Node Value (Module name/title)= $option->nodeValue <br /><br /> <br />";
// save both for each results into database
$query3 = sprintf("INSERT INTO all_modulenames(code,module_name,description_link,gathered_from)
VALUES ('%s','%s','%s','%s')",
mysql_real_escape_string ($codeS),
mysql_real_escape_string($option->nodeValue),
mysql_real_escape_string($link->getAttribute('href')),
mysql_real_escape_string($target_url));
mysql_query($query3) or die(mysql_error()."<br />".$query3);
}
echo "<br /> <br /> <br />";
Here is the table
-- ----------------------------
-- Table structure for `all_modulenames`
-- ----------------------------
DROP TABLE IF EXISTS `all_modulenames`;
CREATE TABLE `all_modulenames` (
`code` varchar(255) NOT NULL,
`module_name` varchar(255) NOT NULL,
`description_link` varchar(255) NOT NULL,
`gathered_from` varchar(255) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
-- ----------------------------
-- Records of all_modulenames
-- ----------------------------
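So the problem is that "$link->getAttribute('href')" is saved separately from the other things I am trying to save. The links are saved first, then the rest of the data, which leaves some rows blank. What I want is to save everything in one go, i.e. fill each row completely, then move on to the next row until every statement has completed. How can I do this? Any help would be greatly appreciated!
Answer 0 (score: 1)
Untested (so it will need debugging), but I would approach it something like this: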
...etc
@$doc->preserveWhiteSpace = false;
//Read through dd tags
$options = $doc->getElementsByTagName('dd');
foreach ($options as $option) {
    // Get the links inside this dd and find the one with the right class
    $href = '';
    $links = $option->getElementsByTagName('a');
    foreach ($links as $link) {
        if ($link->hasAttribute('class') && $link->hasAttribute('href')) {
            // class may hold several space-separated names
            $aClasses = explode(' ', $link->getAttribute('class'));
            if (in_array('modnav', $aClasses)) {
                $href = $link->getAttribute('href');
            }
        }
    }
    // Insert into SQL etc. here; $href is the link href belonging to this dd ...
}
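For illustration, here is a minimal sketch of the same idea: look up the link inside each dd, so that the module name and its href end up in one row. It is untested against the real page. The function name extractModuleRows() is hypothetical, it uses a relative XPath query in place of the nested loop above, and it trims the dd text, which the original code does not do.

```php
<?php
// Sketch only: extractModuleRows() is a hypothetical helper, not part of the original code.

/**
 * Build one row per <dd>, pairing the dd text with the href of the first
 * <a> inside that same <dd> whose class list contains "modnav".
 *
 * @param string $html      Page source, e.g. from file_get_contents()
 * @param string $code      Module code prefix, e.g. "BM"
 * @param string $sourceUrl URL the page was fetched from
 * @return array            List of associative arrays keyed by column name
 */
function extractModuleRows($html, $code, $sourceUrl)
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so parser warnings are silenced.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $rows = array();

    foreach ($doc->getElementsByTagName('dd') as $dd) {
        // The leading "." keeps the query relative to this <dd> only.
        $links = $xpath->query(
            './/a[contains(concat(" ", normalize-space(@class), " "), " modnav ")]',
            $dd
        );
        // Empty string when this <dd> has no matching link.
        $href = $links->length > 0 ? $links->item(0)->getAttribute('href') : '';

        $rows[] = array(
            'code'             => $code,
            'module_name'      => trim($dd->nodeValue), // trim() is an addition
            'description_link' => $href,
            'gathered_from'    => $sourceUrl,
        );
    }

    return $rows;
}
```

Each returned row holds all four column values together, so the existing INSERT can run once per row, using $row['module_name'] and $row['description_link'] in place of $option->nodeValue and $link->getAttribute('href').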