I'm trying to detect URLs in a string that contain slashes in the path (e.g. http://example.com/posts/30) and turn them into HTML markup, like this:
<a href="http://example.com/posts/30">http://example.com/posts/30</a>
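I already have a function that does this, but links with slashes in the path don't work: I get an empty link name. Other links work perfectly (e.g. http://example.com/page.php?id=1). Here is my function: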
//get links in string
function makeLinks($str) {
    $reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
    $urls = array();
    $urlsToReplace = array();
    if(preg_match_all($reg_exUrl, $str, $urls)) {
        $numOfMatches = count($urls[0]);
        $numOfUrlsToReplace = 0;
        for($i=0; $i<$numOfMatches; $i++) {
            $alreadyAdded = false;
            $numOfUrlsToReplace = count($urlsToReplace);
            for($j=0; $j<$numOfUrlsToReplace; $j++) {
                if($urlsToReplace[$j] == $urls[0][$i]) {
                    $alreadyAdded = true;
                }
            }
            if(!$alreadyAdded) {
                array_push($urlsToReplace, $urls[0][$i]);
            }
        }
        $numOfUrlsToReplace = count($urlsToReplace);
        for($i=0; $i<$numOfUrlsToReplace; $i++) {
            $str = str_replace($urlsToReplace[$i], "<div class=\"dont-break-out\"><a href=\"".$urlsToReplace[$i]."\" class=\"titled_url\" rel=\"nofollow\" target=\"_blank\">".get_title($urlsToReplace[$i])."</a></div> ", $str);
        }
        return $str;
    } else {
        return $str;
    }
}
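For comparison, here is a minimal sketch of the same idea built on preg_replace_callback, which rewrites each match in place instead of collecting URLs and calling str_replace afterwards (str_replace can also hit a URL that is a prefix of a longer one). The name makeLinksOnce and the $titleFor parameter are hypothetical, not part of the code above; $titleFor is assumed to be any callable that maps a URL to its link text, such as 'get_title'. Allowing TLDs longer than three letters is also an assumption on my part.

```php
<?php
// Hypothetical variant of makeLinks(). $titleFor is an assumed callable
// (for example 'get_title') that returns the link text for a URL.
function makeLinksOnce(string $str, callable $titleFor): string
{
    // Same schemes as the original pattern. '~' is the delimiter so slashes
    // need no escaping, and the TLD length is not capped at 3 (assumption).
    $pattern = '~(?:https?|ftps?)://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(?:/\S*)?~';

    $titles = []; // cache so a URL that appears twice is resolved only once

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($titleFor, &$titles): string {
            $url = $m[0];
            if (!isset($titles[$url])) {
                $title = trim((string) $titleFor($url));
                // Fall back to the URL so the link text is never empty.
                $titles[$url] = ($title === '') ? $url : $title;
            }
            $href = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
            $text = htmlspecialchars($titles[$url], ENT_QUOTES, 'UTF-8');
            return '<div class="dont-break-out"><a href="' . $href
                . '" class="titled_url" rel="nofollow" target="_blank">'
                . $text . '</a></div> ';
        },
        $str
    );

    // preg_replace_callback returns null on a regex error; keep the input then.
    return $result ?? $str;
}
```

It would be called as makeLinksOnce($text, 'get_title'). Because the title lookup is passed in, the fallback to the URL happens in one place no matter what the lookup returns.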
This is the function that gets the page title for a URL:
//get link title
function get_title($url){
    $str = file_get_contents_utf8($url);
    if(strlen($str)>0){
        $str = trim(preg_replace('/\s+/', ' ', $str)); // supports line breaks inside <title>
        preg_match("/\<title\>(.*)\<\/title\>/i",$str,$title); // ignore case
        $title_trimmed=trim($title[1]);
        if(!empty($title_trimmed)){
            return $title[1];
        }else{
            return $url;
        }
    }
}

This is the UTF-8 wrapper around file_get_contents:
function file_get_contents_utf8($fn) {
    $content = file_get_contents($fn);
    return mb_convert_encoding($content, 'UTF-8',
        mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
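One thing visible in get_title() as written: when the fetched content is empty, the function falls off the end and returns nothing, which would produce an empty link name. Whether that is what happens for the /posts/30 style URLs is not something the code alone confirms. A minimal sketch of a more defensive version follows; extractTitle, titleOrUrl and the $fetch parameter are hypothetical names introduced for illustration, and $fetch is assumed to return the page body as a string, or false on failure.

```php
<?php
// Hypothetical helper: return the text of the first <title> element, or null.
function extractTitle(string $html): ?string
{
    // Non-greedy match, tolerates attributes on <title>, and the 's' flag
    // lets '.' span line breaks inside the element.
    if (!preg_match('~<title\b[^>]*>(.*?)</title>~is', $html, $m)) {
        return null;
    }
    // Collapse runs of whitespace, then trim.
    $title = trim(preg_replace('/\s+/', ' ', $m[1]));
    return $title === '' ? null : $title;
}

// Hypothetical replacement for get_title(): always returns a non-empty string.
// $fetch is an assumed callable: string body on success, false on failure.
function titleOrUrl(string $url, callable $fetch): string
{
    $html = $fetch($url);
    if (!is_string($html) || $html === '') {
        return $url; // fetch failed or the body was empty
    }
    return extractTitle($html) ?? $url;
}
```

A call might look like titleOrUrl($url, function ($u) { return @file_get_contents($u); }), with the encoding conversion applied inside that callable if it is needed.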
Any help would be appreciated.