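I am processing tweets and collecting the URLs they contain.
If an expanded URL turns out to be a Twitter link (one starting with t.co or twitter.com), it should be skipped.
Code: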
    if (preg_match($reg_exUrl, $tweet, $url)) {
        preg_match_all($reg_exUrl, $tweet, $urls);
        foreach ($urls[0] as $url) {
            echo "Tiny url : {$url}<br>";
            $full = MyURLDecode($url);
            echo "Full url : $full<br>";
            if (strpos($full, '//t.co') === true)
                continue;
            if (strpos($full, '//twitter.com') === true)
                continue;
            else if (strpos($full, '//bit.ly') !== true)
                $full = MyURLDecode($full);
            $url_count = get_twitter_url_count($full);
            echo "Url count: $url_count";
            //echo "Numbers of tweets containing this link : ", $code['count'];
            echo "<br>";
        }
    } else {
        echo "Mismatch<br>";
    }

    function MyURLDecode($url)
    {
        $ch = @curl_init($url);
        @curl_setopt($ch, CURLOPT_HEADER, TRUE);
        @curl_setopt($ch, CURLOPT_NOBODY, TRUE);
        @curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
        @curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
        $url_resp = @curl_exec($ch);
        preg_match('/Location:\s+(.*)\n/i', $url_resp, $i);
        if (!isset($i[1]))
        {
            return $url;
        }
        return $i[1];
    }

    function get_twitter_url_count($url) {
        $encoded_url = urlencode($url);
        $content = @file_get_contents('http://urls.api.twitter.com/1/urls/count.json?url=' . $encoded_url);
        return $content ? json_decode($content)->count : 0;
    }
The problems are:
Answer 0 (score: 1)
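For #1, strpos returns the starting position of the text it finds (an integer), or false when there is no match. It never returns true, so a === true comparison can never succeed. Test against false instead, for example:
    strpos($full, '//t.co') !== false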
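To make the return values concrete, here is a minimal sketch of the check. The helper names url_contains and should_skip are made up for this example and are not part of the question's code.

```php
<?php
// strpos() returns the integer offset of the first match, or false when
// there is no match. It never returns true, so "=== true" cannot succeed.
function url_contains(string $url, string $needle): bool
{
    // Strict comparison matters: a match at offset 0 is falsy but is not false.
    return strpos($url, $needle) !== false;
}

// Hypothetical helper: true when the expanded URL should be skipped.
function should_skip(string $url): bool
{
    return url_contains($url, '//t.co') || url_contains($url, '//twitter.com');
}

var_dump(strpos('https://t.co/abc', '//t.co'));    // int(6)
var_dump(should_skip('https://t.co/abc'));         // bool(true)
var_dump(should_skip('https://example.com/page')); // bool(false)
```

Keep in mind that a substring test for '//t.co' also matches any host that merely begins with t.co (for example t.company.example). If that matters, comparing the result of parse_url($url, PHP_URL_HOST) against an exact list of hosts is stricter.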
For #2, try calling MyURLDecode() in a while loop, so the URL is expanded repeatedly until it stops changing, for example:
    $previous = $full;
    while (($full = MyURLDecode($full)) != $previous) {
        $previous = $full;
    }
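One caveat with the loop above: if two short links redirect to each other, it never terminates. A possible variant with a hop limit is sketched below. The names resolve_fully, $maxHops and $fakeDecode are assumptions for illustration; $fakeDecode is a stand-in for MyURLDecode() so the example runs without network access.

```php
<?php
// Expand a URL repeatedly until it stops changing or a hop limit is hit.
// $decode stands in for MyURLDecode(); $maxHops is an assumed safety cap.
function resolve_fully(string $url, callable $decode, int $maxHops = 10): string
{
    for ($hop = 0; $hop < $maxHops; $hop++) {
        $next = $decode($url);
        if ($next === $url) {
            break; // No further redirect: this is the final URL.
        }
        $url = $next;
    }
    return $url;
}

// Stand-in resolver backed by a fixed map of made-up redirects.
$redirects = [
    'http://t.co/a'   => 'http://bit.ly/b',
    'http://bit.ly/b' => 'http://example.com/article',
];
$fakeDecode = function (string $url) use ($redirects): string {
    return $redirects[$url] ?? $url;
};

echo resolve_fully('http://t.co/a', $fakeDecode), "\n"; // http://example.com/article
```

In the real script you would pass 'MyURLDecode' as the callable, e.g. resolve_fully($url, 'MyURLDecode').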