How do I submit a form with PHP and retrieve a file from another site?

Asked: 2010-11-05 23:28:44

Tags: php post download webforms

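I need to download a CSV file from http://www.carriersoftwaredata.com/Login.aspx?file=FHWA to my local server every day. My plan is to set up a cron job that runs a PHP script to do this. However, the page requires a username and password before it allows the file to be downloaded. If I visit the page in a web browser, enter the username and password, and submit the form, the browser shows a "Download" dialog and starts downloading the file.

How can I use PHP to submit the form and download the file it serves?

This is what I am currently doing to gather the necessary $_POST information.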

//url for fhwa db file
$fhwa_url = 'http://www.carriersoftwaredata.com/Login.aspx?file=FHWA';

//get the contents of the fhwa page
$fhwa_login = file_get_contents($fhwa_url);

//load contents of fhwa page into a DOMDocument object
$fhwa_dom = new DOMDocument;
if (!$fhwa_dom->loadhtml($fhwa_login))
{
   echo 'Could not Load html for FHWA Login page.';
}
else
{
   //create a post array to send back to the server - keys relate to the name of the input
   $fhwa_post_items = array(
      '__VIEWSTATE'=>'',
      'Email'=>'',
      'Password'=>'',
      '__EVENTVALIDATION'=>'',
   );

   //create an xpath object
   $xpath = new DOMXpath($fhwa_dom);

   //iterate through the form1 form and find all inputs
   foreach($xpath->query('//form[@name="form1"]//input') as $input)
   {
      //get name and value of input
      $input_name = $input->getAttribute('name');
      $input_value = $input->getAttribute('value');

      //check if input name matches a key in the post array
      if(array_key_exists($input_name, $fhwa_post_items))
      {
         //if the input name is Email or Password enter the defined email and password
         switch($input_name)
         {
            case 'Email':
               $input_value = $email;
               break;
            case 'Password':
               $input_value = $pass;
               break;
         }//switch

         //assign value to post array
         $fhwa_post[$input_name] = $input_value;
      }// if
   }// foreach
}// if

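This is how I am trying to submit the form, but it doesn't seem to work the way I need. I want the content returned by stream_get_contents to be the content of the CSV file I am trying to download.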

   //get the url data and open a connection to the page
   $url_data = parse_url($fhwa_url);
   $post_str = http_build_query($fhwa_post);

   //create socket
   $fp = @fsockopen($url_data['host'], 80, $errno, $errstr, 30);
   fputs($fp, "POST $fhwa_url HTTP/1.0\r\n");
   fputs($fp, "Host: {$url_data['host']}\r\n");
   fputs($fp, "User-Agent: Mozilla/4.5 [en]\r\n");
   fputs($fp, "Content-Type: application/x-www-form-urlencoded\r\n");
   fputs($fp, "Content-Length: ".strlen($post_str)."\r\n");
   fputs($fp, "\r\n");
   fputs($fp, $post_str."\r\n\r\n");
   echo stream_get_contents($fp);
   fclose($fp);

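Any help is greatly appreciated.

3 Answers:

Answer 0: (score: 1)

It looks like you need to use the cURL library: http://www.php.net/manual/en/book.curl.php

Here is an example from their site of sending POST data with curl (the POST form data is the array $data):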

 <?php

/* http://localhost/upload.php:
print_r($_POST);
print_r($_FILES);
*/

$ch = curl_init();

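// The '@/path' upload syntax is deprecated since PHP 5.5 and unsupported in PHP 7+; CURLFile replaces it
$data = array('name' => 'Foo', 'file' => new CURLFile('/home/user/test.png'));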

curl_setopt($ch, CURLOPT_URL, 'http://localhost/upload.php');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);

curl_exec($ch);
?>
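
Applied to a login form like the one in the question, the same idea might look like the sketch below. It assumes the form accepts ordinary URL-encoded fields and that the server answers the POST with the file itself. The function name curl_post_to_file and its parameters are made up for illustration and are not part of cURL. Passing a string built with http_build_query() rather than an array makes cURL send application/x-www-form-urlencoded instead of multipart/form-data.

```php
<?php
/**
 * Hypothetical helper: POST $fields to $url and stream the response body to $dest.
 * Returns true on success, false on failure.
 */
function curl_post_to_file($url, array $fields, $dest)
{
    // Open the destination first so we fail early if it is not writable.
    $out = fopen($dest, 'wb');
    if ($out === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    // A query string (not an array) is sent as application/x-www-form-urlencoded.
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
    // Follow a redirect if the server issues one after login (an assumption about the site).
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Write the body directly to the file handle; response headers are not included.
    curl_setopt($ch, CURLOPT_FILE, $out);

    $ok = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    fclose($out);

    return $ok !== false && $status === 200;
}
```

Because the body is written straight to the file handle, a large download never has to fit in memory.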

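Answer 1: (score: 1)

The approach I would take is to use curl. If the server uses HTTP Basic Auth, you can pass the credentials along with the request. After that, you should be able to take the data retrieved from the curl request and save it to your local machine with PHP's fwrite function.

Here is an example I have used. I stripped out my fancy logic and left just the basics. It also doesn't include the Basic Auth code, but you should be able to find that easily on Google. I think this may be the easier route, unless there is something else you need me to look at.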

// Download data
$ch = curl_init('http://www.server.com/file.txt');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$rawdata = curl_exec($ch);
curl_close ($ch);

// Save data
$fp = fopen('new/file/location.txt','w');
fwrite($fp, $rawdata);
fclose($fp);
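
For the Basic Auth part that the example leaves out, a minimal sketch could look like this. It only helps if the server really uses HTTP Basic Auth; the login page in the question is an HTML form, so this may not apply there. The function name fetch_with_basic_auth, the URL, and the credentials are placeholders.

```php
<?php
/**
 * Hypothetical helper: fetch $url with HTTP Basic Auth.
 * Returns the response body as a string, or false on failure.
 */
function fetch_with_basic_auth($url, $user, $pass)
{
    $ch = curl_init($url);
    // Return the body from curl_exec() instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Send the credentials using the Basic scheme.
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($ch, CURLOPT_USERPWD, $user . ':' . $pass);

    return curl_exec($ch);
}

// Usage (placeholder URL and credentials):
// $data = fetch_with_basic_auth('http://www.server.com/file.txt', 'user', 'secret');
// if ($data !== false) { file_put_contents('new/file/location.txt', $data); }
```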

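Answer 2: (score: 0)

I found that my original approach was working, but I simply wasn't sending enough POST information. I wrote a function to help me do this easily in the future.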

/**
 * @author: jeremysawesome - www.dnawebagency.com
 * @description: save_remote_file - Requires a url, a post array and a
 *    filename. This function will send the post_array to the remote server
 *    and retrieve the remote file. The retrieved remote file will be
 *    saved to the filename specified by the filename parameter.
 *    Note: This function should only be used when specific post information
 *       must be submitted to the remote server before the file download
 *       will begin.
 *       For Example: http://www.carriersoftwaredata.com/Login.aspx?file=FHWA
 * @param: post_url - the url to send a post request to
 * @param: post_array - the post arguments to send to the remote server.
 * @param: filename - the name to save the retrieved remote file as
**/
function save_remote_file($post_url, $post_array, $filename)
{
   //get the url data
   $url_data = parse_url($post_url);
   $post_str = http_build_query($post_array);

   //build the headers to send to the form
   $headers = "POST $post_url HTTP/1.0\r\n";
   $headers .= "Host: {$url_data['host']}\r\n";
   $headers .= "User-Agent: Mozilla/4.5 [en]\r\n";
   $headers .= "Content-Type: application/x-www-form-urlencoded\r\n";
   $headers .= "Content-Length: ".strlen($post_str)."\r\n";
   $headers .= "\r\n";
   $headers .= $post_str."\r\n\r\n";

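   //create socket and download data
   $fp = fsockopen($url_data['host'], 80, $errno, $errstr, 30);
   if (!$fp)
   {
      die("Cannot connect to {$url_data['host']}: $errstr ($errno)");
   }
   fputs($fp, $headers);
   $response = stream_get_contents($fp);
   fclose($fp);

   //the raw response is the HTTP headers, a blank line, then the body - keep only the body
   $parts = explode("\r\n\r\n", $response, 2);
   $remote_file = isset($parts[1]) ? $parts[1] : '';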

   //save data
   $saved_file = fopen($filename,'w') or die('Cannot Open File '. $filename);
   fwrite($saved_file, $remote_file);
   fclose($saved_file);
}// save_remote_file

For those who care, or who need to do something similar in the future, here is the complete code (minus the function above).


<?php
/**
 * @author: jeremysawesome - www.dnawebagency.com
 * @description: This file retrieves the database information from Carrier Software.
 *    This should be run with a cron in order to download the files. The file works
 *    by getting the contents of the Carrier Software login page for the database. An
 *    array of post values is created based on the contents of the login page and the
 *    defined username and password. A socket is opened and the post information is
 *    passed to the login page.
**/

//define username and pass
$user_info = array(
   'user'=>'[USERNAME]',
   'pass'=>'[PASSWORD]'
);

//url for fhwa db file
$fhwa_url = 'http://www.carriersoftwaredata.com/Login.aspx?file=FHWA';

//get the contents of the fhwa page
$fhwa_login = file_get_contents($fhwa_url);

//load contents of fhwa page into a DOMDocument object
$fhwa_dom = new DOMDocument;
if (!$fhwa_dom->loadhtml($fhwa_login))
{
   die('Could not Load html for FHWA Login page.');
}
else
{
   //create a post array to send back to the server - keys relate to the name of the input - this allows us to retrieve the randomly generated values of hidden inputs
   $fhwa_post_items = array(
      '__EVENTTARGET' => '',
      '__EVENTARGUMENT' => '',
      '__VIEWSTATE' => '',
      'Email' => '',
      'Password' => '',
      'btnSubmit' => 'Submit',
      '__EVENTVALIDATION' => '',
   );

   //create an xpath object
   $xpath = new DOMXpath($fhwa_dom);

   //iterate through the form1 form and find all inputs
   foreach($xpath->query('//form[@name="form1"]//input') as $input)
   {
      //get name and value of input
      $input_name = $input->getAttribute('name');
      $input_value = $input->getAttribute('value');

      //check if input name matches a key in the post array
      if(array_key_exists($input_name, $fhwa_post_items))
      {
         //if the input name is Email or Password enter the defined email and password
         switch($input_name)
         {
            case 'Email':
               $input_value = $user_info['user'];
               break;
            case 'Password':
               $input_value = $user_info['pass'];
               break;
         }//switch

         //assign value to post array
         $fhwa_post[$input_name] = $input_value;
      }// if
   }// foreach

   //save the file - function shown above
   save_remote_file($fhwa_url, $fhwa_post, "my_data_folder/my_fhwa-db.zip");
}// if
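
Since the fix in this answer came down to sending every field the form expects, one way to avoid missing a hidden input is to collect all named inputs instead of keeping a hand-written list of keys. The sketch below is an illustration, not the author's code: the helper name collect_form_inputs and the sample markup are made up, and it assumes every field the server needs is present as an input element in the page source.

```php
<?php
/**
 * Hypothetical helper: return name => value for every named <input>
 * inside the form called $form_name.
 */
function collect_form_inputs($html, $form_name)
{
    $dom = new DOMDocument;
    // Real-world pages are rarely valid HTML; suppress the parser warnings.
    if (!@$dom->loadHTML($html)) {
        return array();
    }

    $fields = array();
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//form[@name="' . $form_name . '"]//input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Self-contained demo with made-up markup (not the real login page):
$html = '<html><body><form name="form1">'
      . '<input type="hidden" name="__VIEWSTATE" value="abc123">'
      . '<input type="text" name="Email" value="">'
      . '<input type="password" name="Password" value="">'
      . '<input type="submit" name="btnSubmit" value="Submit">'
      . '</form></body></html>';

$post = collect_form_inputs($html, 'form1');
// Fill in the credentials after scraping the defaults.
$post['Email'] = '[USERNAME]';
$post['Password'] = '[PASSWORD]';
print_r($post);
```

The resulting array can be passed as the post_array argument of save_remote_file. Note that this picks up every submit button in the form as well, so a form with several buttons would need the unwanted ones removed first.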