Make file_get_contents faster and skip 404 or non-existent pages

Time: 2016-02-10 00:58:56

Tags: php json

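Currently this process takes a while to run, and because some files get deleted over time, it also runs into some 404 errors. I need to make it run faster and have it skip any 404 or non-existent pages. Right now, when it skips a page, it still generates a URL with a blank value: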

  

https://widget.mcf.li/project/.json

I need it to output nothing at all in that case.

    $ch = curl_init();

    // Set the URL
    curl_setopt($ch, CURLOPT_URL, "bot.notenoughmods.com/1.8.9.json");

    // Return the transfer as a string
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');

    // Decode the response body into an associative array
    $json = json_decode(curl_exec($ch), true);

    // Get items from the NEM API
    foreach ($json as $item) {
        $link = $item['longurl'];

        // Reduce the URL to its domain (e.g. curse.com)
        $link = preg_replace("/htt.{1,2}:\/\/(.+?[\.\-])*(\w{1,61}\.[a-zA-Z]{2,})\/.*/i", "$2", $link);

        // If it's curse.com, process normally
        if ($link == 'curse.com') {
            $url = $item['longurl'];
            $newurl = explode("http://www.curse.com", $url);
            echo 'https://widget.mcf.li' . $newurl[1] . '.json <br />';
        }

        // If it's curseforge.com, process the HTML
        if ($link == 'curseforge.com') {
            $html = @file_get_contents($item['longurl']);
            // Check if the URL gives a 404
            if (empty($html)) {
                return;
            } else {
                preg_match('%<li class="view-on-curse">\s+<a href="http:\/\/curse\.com\/project\/(?P<id>.*)">\s+View on Curse\.com\s+<\/a>\s+<\/li>%', $html, $matches);
                echo 'https://widget.mcf.li/project/' . $matches['id'] . '.json <br />';
            }
        }
    }
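
A possible direction, offered as a sketch rather than a tested fix. Two things in the code above seem likely to matter. First, the result of `preg_match` is never checked, so a page that loads but lacks the "View on Curse.com" link would still echo `https://widget.mcf.li/project/.json`. Second, `return` inside the loop ends a top-level script at the first empty page, where `continue` would only skip that item. The helper names below (`widgetUrl`, `fetchOkBodies`) are hypothetical and not part of the original post. The sketch assumes the cURL extension is available and that fetching the curseforge.com pages concurrently with `curl_multi` is acceptable. It also loosens the original regex slightly and drops any query string from the URL.

```php
<?php
/**
 * Hypothetical helper: build the widget URL for one item's 'longurl',
 * or return null when there is nothing sensible to print.
 *
 * @param string      $longUrl The item's 'longurl' value.
 * @param string|null $html    Page body for curseforge.com links, if fetched.
 * @return string|null
 */
function widgetUrl($longUrl, $html = null)
{
    $host = strtolower((string) parse_url($longUrl, PHP_URL_HOST));
    $path = rtrim((string) parse_url($longUrl, PHP_URL_PATH), '/');

    // curse.com links map straight onto the widget path.
    if (preg_match('/(^|\.)curse\.com$/', $host)) {
        return $path === '' ? null : 'https://widget.mcf.li' . $path . '.json';
    }

    // curseforge.com links need the project id scraped from the page.
    if (preg_match('/(^|\.)curseforge\.com$/', $host)) {
        if (empty($html)) {
            return null; // 404, timeout, or empty body: skip
        }
        $pattern = '%<li class="view-on-curse">\s*<a href="http://curse\.com/project/(?P<id>[^"]+)"%';
        if (!preg_match($pattern, $html, $m)) {
            return null; // page loaded but has no Curse link: skip
        }
        return 'https://widget.mcf.li/project/' . $m['id'] . '.json';
    }

    return null; // any other host is ignored, as in the original loop
}

/**
 * Hypothetical helper: fetch several URLs concurrently and keep only
 * the bodies that came back with HTTP 200. Keys of $urls are preserved.
 *
 * @param array $urls           Map of key => URL.
 * @param int   $timeoutSeconds Per-request timeout (an assumed default).
 * @return array Map of key => body for successful requests only.
 */
function fetchOkBodies(array $urls, $timeoutSeconds = 10)
{
    $multi = curl_multi_init();
    $handles = array();
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running && curl_multi_select($multi, 1.0) === -1) {
            usleep(100000); // select can fail on some systems; avoid a busy loop
        }
    } while ($running && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    $bodies = array();
    foreach ($handles as $key => $ch) {
        if (curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200) {
            $bodies[$key] = curl_multi_getcontent($ch);
        }
        curl_multi_remove_handle($multi, $ch);
    }
    return $bodies;
}
```

To use it, one would collect the curseforge.com `longurl` values from `$json` into an array, pass them to `fetchOkBodies()` once, then loop over the items and echo `widgetUrl($item['longurl'], $body)` only when it is not `null`. Items whose page returned a 404 never appear in the result array, so they produce no output.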

0 answers:

No answers yet