How can I loop through a CSV file of websites and use curl to test whether they are online?

Asked: 2018-12-18 19:06:46

Tags: php pdo cron

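This cron job PHP script finds a CSV file on my server and loops through the URLs in that file. For each one it uses curl to check whether the site loads over https or http, or is offline. The curl requests may be taking too much time. I have already done this with AJAX POST requests, and that works, but I need to do it with a cron job and a CSV file. Are there any other possible solutions?

Can you see why the task fails to complete?

Any help would be great.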

function url_test($url){

  $timeout = 20;
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_HEADER, true);  // we want headers
  curl_setopt($ch, CURLOPT_NOBODY, true);  // we don't need body
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
  curl_exec($ch);
  $http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
  curl_close($ch);  // release the handle here; $ch is not visible outside this function

  // CURLINFO_HTTP_CODE is an integer; 0 means no HTTP response was received
  return $http_code === 200 || $http_code === 301;

}

// Run on every URL

$offline = 0;
$fullcount = 0;

try
  {
  // Connect and prepare once, before the loop, instead of once per offline site
  $conn = new PDO("conn info here");

  // set the PDO error mode to exception
  $conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

  // Bound parameters keep quotes in a URL from breaking the query
  $stmt = $conn->prepare("INSERT INTO table (url,csv,related)
      VALUES (:url,:csv,1)");
  }
catch(PDOException $e)
  {
  exit("Connection failed: " . $e->getMessage());
  }

if (($handle = fopen("/pathtocsv/".$csv, "r")) !== FALSE)
  {
  while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
    {
    $num = count($data);
    for ($c = 0; $c < $num; $c++)
      {
      $site = $data[$c];

      // Remove all whitespace and lowercase the entry before adding a scheme
      $host = strtolower(preg_replace('/\s+/', '', (string) $site));
      if ($host === '')
        {
        continue;  // skip empty cells and blank lines
        }
      $https = "https://".$host;
      $http = "http://".$host;

      if (url_test($https))
        {
        $fullcount++;
        echo $https. " <br>";
        }
      else if (url_test($http))
        {
        $fullcount++;
        echo $http. " <br>";
        }
      else
        {
        $offline++;
        echo $site. " <br>";

        try
          {
          $stmt->execute([':url' => $site, ':csv' => $csv]);
          echo "New record created successfully";
          }
        catch(PDOException $e)
          {
          echo "Insert failed: " . $e->getMessage();
          }
        }
      }
    }
  fclose($handle);
  }

1 answer:

Answer 0 (score: 0)

You can use the get_headers() function. (reference)

It returns a response similar to the following:

Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Date: Sat, 29 May 2004 12:28:13 GMT
    [2] => Server: Apache/1.3.27 (Unix)  (Red-Hat/Linux)
    [3] => Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
    [4] => ETag: "3f80f-1b6-3e1cb03b"
    [5] => Accept-Ranges: bytes
    [6] => Content-Length: 438
    [7] => Connection: close
    [8] => Content-Type: text/html
)

You can use this to validate the response as needed.
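As a rough sketch of that validation, the status code can be read from the "HTTP/..." lines of the returned array. The helper names status_from_headers() and url_is_online() are made up for illustration, the 10-second timeout is an assumed value, and treating any 2xx or 3xx code as "online" is a choice made here (the question's url_test() only accepts 200 and 301). Passing a stream context as the third argument to get_headers() needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: pull the final HTTP status code out of the array that
// get_headers() returns. Redirects add one status line per hop, so the last
// "HTTP/..." line is the one that counts.
function status_from_headers(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        if (is_string($line) && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Hypothetical wrapper: true when the URL answers with a 2xx or 3xx status.
// The 10-second default timeout is an assumption, not a value from the question.
function url_is_online(string $url, int $timeout = 10): bool
{
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'timeout' => $timeout],
    ]);
    $headers = @get_headers($url, false, $context); // false on failure
    if ($headers === false) {
        return false;
    }
    $code = status_from_headers($headers);
    return $code !== null && $code >= 200 && $code < 400;
}

// Offline demonstration with the first line of the sample response.
var_dump(status_from_headers(['HTTP/1.1 200 OK', 'Connection: close'])); // int(200)
```

In the CSV loop, url_is_online($https) could stand in for url_test($https); whether it is faster than curl depends on the sites being checked, since both wait for the same timeouts.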

As for why the task you are running does not finish: have you checked the error log?
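If nothing shows up in the log, it may help to make sure errors are actually being written when the script runs from cron, and to estimate how long the job can take. The sketch below is only a starting point: worst_case_seconds() is a hypothetical helper, the log file name is an assumption, and 1000 is an assumed row count. The 20-second timeout and the two attempts per site (https, then http) come from the question's code.

```php
<?php
// Hypothetical helper: upper bound on runtime if every URL is offline and
// each attempt runs into the full timeout.
function worst_case_seconds(int $urlCount, int $timeoutSeconds, int $attemptsPerUrl = 2): int
{
    return $urlCount * $timeoutSeconds * $attemptsPerUrl;
}

// Make PHP errors visible when the script runs from cron.
// The log path is an assumption; use any file the cron user can write to.
ini_set('log_errors', '1');
ini_set('error_log', sys_get_temp_dir() . '/url_check_errors.log');
error_reporting(E_ALL);

// 1000 is an assumed row count; 20 is the $timeout used in url_test().
$seconds = worst_case_seconds(1000, 20);
error_log(sprintf('Worst case for 1000 URLs: %d s (about %.1f h)', $seconds, $seconds / 3600));
```

With those assumed numbers the bound is 40000 seconds, roughly 11 hours, so a long CSV with many dead sites can easily outlast the interval between cron runs.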