Question

I'm trying to read a small piece of a website's code, http://www.site.com/category

The snippet I want to find looks like this:

<div class="Brands">
    <h2>Search design</h2>
    <div class="columns">
        <div class="column first">
            <div>
                <a href="/category?Brand=flash">flash</a>
                <span>(9)</span>
            </div>
            <div>
                <a href="/category?Brand=bolt">bolt</a>
                <span>(4)</span> And so on...

What I want to do is read each href address and put the name in front of it, in a table with 2 columns.
Ex:
flash www.site.com/category?Brand=flash
bolt www.site.com/category?Brand=bolt

I have tried a few different approaches, but I can't work it out.

<?php
$search = 'columns';
$lines = file('http://www.site.com/category');

// Becomes true once the search text is found
$found = false;
foreach ($lines as $line) {
    if (strpos($line, $search) !== false) {
        $found = true;
        echo $line;
    }
}

// text not found
if (!$found) {
    echo 'No match found';
}
?>

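This gives me a list of the brands, but after each brand I would like the direct link to the page to be shown.

Any ideas on how I can add that functionality?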

Answer 1

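I followed the way you started, parsing the file line by line, but you have to make sure the format doesn't change. This should give you an associative array like (BRAND => LINK).

I used explode() because the HTML pattern you gave isn't that complicated, but some tweaking may be needed if not all links follow this pattern (a link such as /category?Brand=flash&key=value, for example, would need one, since the brand would come out as flash&key).

If it gets more complicated, look into using regular expressions.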

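$brand_array = array();
foreach ($lines as $line)
{
  // Assumes the matching line also holds the <div> children (markup delivered on one line)
  if (strpos($line, $search) !== false)
  {
    $found = true;
    $tmp = explode('<div>', $line); // -> <a href="/category?Brand=flash">flash</a><span>(9)</span></div>
    $count = count($tmp);
    for ($i = 1; $i < $count; ++$i) { // $tmp[0] is whatever precedes the first <div>
      $tmp_href = explode("\"", $tmp[$i]); // -> $tmp_href[1] = wanted href
      if (!isset($tmp_href[1])) {
        continue; // no quoted attribute in this piece
      }
      $tmp_brand = explode('=', $tmp_href[1]); // -> $tmp_brand[1] = wanted brand
      if (!isset($tmp_brand[1])) {
        continue; // href has no "=" in it
      }
      $brand_array[$tmp_brand[1]] = 'http://www.site.com' . $tmp_href[1];
    }
  }
}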
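
If the markup stops fitting what explode() can handle, the regular-expression route could look roughly like the sketch below. It is only an illustration: the helper name extract_brand_links() is made up for this example, the sample markup is copied from the question, and the pattern assumes every brand link is written as href="/category?Brand=..." with double quotes and href as the first attribute.

```php
<?php
// Sample markup copied from the question. In practice this would be the
// downloaded page, e.g. file_get_contents('http://www.site.com/category').
$html = <<<'HTML'
<div class="column first">
    <div>
        <a href="/category?Brand=flash">flash</a>
        <span>(9)</span>
    </div>
    <div>
        <a href="/category?Brand=bolt">bolt</a>
        <span>(4)</span>
    </div>
</div>
HTML;

// Hypothetical helper (not from the original answer):
// returns an array of brand => absolute link for every brand anchor found.
function extract_brand_links(string $html, string $base): array
{
    $brands = [];
    // Group 1 captures the href, group 2 captures the link text.
    $pattern = '~<a\s+href="(/category\?Brand=[^"]*)"[^>]*>([^<]*)</a>~i';
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // Decode entities such as &amp; that may appear inside the href.
            $brands[trim($m[2])] = $base . html_entity_decode($m[1]);
        }
    }
    return $brands;
}

$brand_array = extract_brand_links($html, 'http://www.site.com');
print_r($brand_array);
```

Because the pattern runs over the whole page as one string, it does not depend on how the HTML is split into lines, and the brand name comes from the link text rather than from the query string.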

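If you want a more reliable approach, or you have to parse many HTML files for links, brands and so on, you should look for a good HTML parsing library. There are plenty of libraries that do this.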
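
As one possible illustration of the library route, here is a minimal sketch using PHP's bundled DOM extension (DOMDocument and DOMXPath). It assumes that extension is installed, that the brand block keeps class="Brands" exactly as in the question, and that a plain HTML table is the wanted output. The sample markup is the question's snippet with the closing tags filled in so that it parses on its own.

```php
<?php
// The question's snippet, closed off so it is a complete fragment.
$html = <<<'HTML'
<div class="Brands">
    <h2>Search design</h2>
    <div class="columns">
        <div class="column first">
            <div>
                <a href="/category?Brand=flash">flash</a>
                <span>(9)</span>
            </div>
            <div>
                <a href="/category?Brand=bolt">bolt</a>
                <span>(4)</span>
            </div>
        </div>
    </div>
</div>
HTML;

$base = 'http://www.site.com';

// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select every link inside the element whose class attribute is exactly "Brands".
$xpath = new DOMXPath($doc);
$links = $xpath->query('//div[@class="Brands"]//a[@href]');

// Build brand => absolute link.
$rows = [];
foreach ($links as $a) {
    $rows[trim($a->textContent)] = $base . $a->getAttribute('href');
}

// Render the two-column table: brand name, then its link.
$table = "<table>\n";
foreach ($rows as $brand => $url) {
    $table .= sprintf(
        "  <tr><td>%s</td><td><a href=\"%s\">%s</a></td></tr>\n",
        htmlspecialchars($brand),
        htmlspecialchars($url),
        htmlspecialchars($url)
    );
}
$table .= "</table>\n";
echo $table;
```

To run it against the live page, $html could be filled with file_get_contents('http://www.site.com/category') instead of the inline sample, provided allow_url_fopen is enabled.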
