Question

I want to find all URLs in a string (a curl result) and then encode any query strings in those results, for example:

URL found:

http://www.example.com/index.php?favoritecolor=blue&favoritefood=sharwarma

Replace every URL found with its encoded string (I can only manage to do one of them):

http%3A%2F%2Fwww.example.com%2Findex.php%3Ffavoritecolor%3Dblue%26favoritefood%3Dsharwarma

But I need to do this across an HTML curl response, finding all URLs in the HTML page. Thanks in advance, I have been searching for hours.

Answer 1

If your cURL result is an HTML page and you only want the links from "a" elements (not images or other clickable elements), this will do what you want.

$xml = new DOMDocument();

// $html should be your CURL result
$xml->loadHTML($html);

// or you can do that directly by providing the requested page's URL to loadHTMLFile
// $xml->loadHTMLFile("http://...");

// this array will contain all links
$links = array();

// loop through all "a" elements
foreach ($xml->getElementsByTagName("a") as $link) {
    // URL-encodes the link's URL and adds it to the previous array
    $links[] = urlencode($link->getAttribute("href"));
}

// now do whatever you want with that array

The $links array will contain all links found in the page, in URL-encoded format.
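As a quick check of the snippet above, here is a minimal self-contained sketch. The sample $html string is made up for illustration and stands in for the cURL result. The libxml_use_internal_errors() calls are an addition, not part of the original answer; they are assumed to be useful because real-world markup often triggers parser warnings.

```php
<?php
// Hypothetical sample standing in for a cURL response body.
$html = '<html><body>'
      . '<a href="http://www.example.com/index.php?favoritecolor=blue&amp;favoritefood=sharwarma">link</a>'
      . '<img src="http://www.example.com/logo.png">'
      . '</body></html>';

// Collect parser warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);

$xml = new DOMDocument();
$xml->loadHTML($html);

// Discard the collected warnings and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Gather the URL-encoded href of every "a" element (the img is ignored).
$links = array();
foreach ($xml->getElementsByTagName("a") as $link) {
    $links[] = urlencode($link->getAttribute("href"));
}

// $links[0] is the encoded form shown in the question:
// http%3A%2F%2Fwww.example.com%2Findex.php%3Ffavoritecolor%3Dblue%26favoritefood%3Dsharwarma
print_r($links);
```

Note that getAttribute() returns the decoded attribute value, so the &amp; in the markup reaches urlencode() as a plain & and becomes %26.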

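Edit: if you want to replace all the links in the page while keeping everything else intact, it is better to use DOMDocument than regular expressions (related: why you shouldn't use regex to handle HTML). Here is an edited version of my code that replaces every link with its URL-encoded equivalent and then saves the page to a variable: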

$xml = new DOMDocument();

// $html should be your CURL result
$xml->loadHTML($html);

// loop through all "a" elements
foreach ($xml->getElementsByTagName("a") as $link) {
    // gets original (non URL-encoded link)
    $original = $link->getAttribute("href");

    // sets new link to URL-encoded format
    $link->setAttribute("href", urlencode($original));
}

// save modified page to a variable
$page = $xml->saveHTML();

// now do whatever you want with that modified page, for example you can "echo" it
echo $page;

Based on this code.
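The code above encodes the whole href, which matches the example in the question. If instead only the query string should be encoded and the rest of the URL left usable, one possible sketch follows. The helper name encodeQueryOnly is hypothetical, and it assumes that "encoding the query string" means percent-encoding each parameter name and value while keeping the "?", "&" and "=" separators. It also assumes the first "?" in the URL starts the query.

```php
<?php
// Hypothetical helper (not from the original answers): percent-encodes
// the parameter names and values of a URL, leaving the rest untouched.
function encodeQueryOnly(string $url): string
{
    // Split at the first "?"; a URL without one has no query string.
    $parts = explode('?', $url, 2);
    if (count($parts) < 2) {
        return $url;
    }
    [$base, $rest] = $parts;

    // Keep a trailing "#fragment" out of the query handling.
    $fragment = '';
    $hashPos = strpos($rest, '#');
    if ($hashPos !== false) {
        $fragment = substr($rest, $hashPos);
        $rest = substr($rest, 0, $hashPos);
    }

    $pairs = array();
    foreach (explode('&', $rest) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as "a=1&&b=2"
        }
        $kv = explode('=', $pair, 2);
        // Decode first so already-encoded input is not encoded twice.
        $encoded = rawurlencode(urldecode($kv[0]));
        if (isset($kv[1])) {
            $encoded .= '=' . rawurlencode(urldecode($kv[1]));
        }
        $pairs[] = $encoded;
    }

    return $base . '?' . implode('&', $pairs) . $fragment;
}

echo encodeQueryOnly('http://www.example.com/search.php?q=blue cheese&lang=en');
```

Inside the loop over the "a" elements, this helper could be used in place of urlencode(), for example $link->setAttribute("href", encodeQueryOnly($original)).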

Answer 2

Don't use the PHP DOM directly, it will slow down your execution time. Use simplehtmldom instead, it's easy:

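// $data is expected to be a simple_html_dom object, e.g. the result of str_get_html($html)
// note: despite its name, this function URL-encodes every link
function decodes($data) {
    // loop through all "a" elements
    foreach ($data->find('a') as $hres) {
        // original (non URL-encoded) link
        $bbs = $hres->href;
        // replace it with its URL-encoded form
        $hres->href = urlencode($bbs);
    }
    // return the modified document
    return $data;
}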

Find all URLs in a string and encode the query string?
