Question

I copied this code from a book:

<?php
# Initialization
include("LIB_http.php");
include("LIB_parse.php");
$product_array=array();
$product_count=0;

# Download the target (practice store) web page
$target = "http://www.WebbotsSpidersScreenScrapers.com/example_store";
$web_page = http_get($target, "");

# Parse all the tables on the web page into an array
$table_array = parse_array($web_page['FILE'], "<table", "</tables>");

# Look for the table that contains the product information
for($xx=0; $xx<count($table_array); $xx++)
  {
  $table_landmark = "Products For Sale";
  if(stristr($table_array[$xx], $table_landmark))   // Process this table
    {
    echo "FOUND: Product table\n";


# Parse table into an array of table rows
$product_row_array = parse_array($table_array[$xx], "<tr", "</tr>");
for($table_row=0; $table_row<count($product_row_array); $table_row++)
  {
  # Detect the beginning of the desired data (heading row)
  $heading_landmark = "Condition";
  if((stristr($product_row_array[$table_row], $heading_landmark)))
  {
  echo "FOUND: Table heading row\n";

  # Get the position of the desired headings
  $table_cell_array = parse_array($product_row_array[$table_row], "<td", "</td>");
  for($heading_cell=0; $heading_cell<count($table_cell_array); $heading_cell++)
    {
    if(stristr(strip_tags(trim($table_cell_array[$heading_cell])), "ID#"))
      $id_column=$heading_cell;
    if(stristr(strip_tags(trim($table_cell_array[$heading_cell])), "Product name"))
      $name_column=$heading_cell;
    if(stristr(strip_tags(trim($table_cell_array[$heading_cell])), "Price"))
      $price_column=$heading_cell;
    }
  echo "FOUND: id_column=$id_column\n";
  echo "FOUND: price_column=$price_column\n";
  echo "FOUND: name_column=$name_column\n";   

  # Save the heading row for later use

  $heading_row = $table_row;
  }

  #Detect the end of the desired data table
  $ending_landmark = "Calculate";
  if((stristr($product_row_array[$table_row], $ending_landmark)))
    {
    echo "PARSING COMPLETE!\n";
    break;
    }

  # Parse product and price data
  if(isset($heading_row) && $heading_row<$table_row)
    {
    $table_cell_array = parse_array($product_row_array[$table_row], "<td", "</td>");
    $product_array[$product_count]['ID'] = strip_tags(trim($table_cell_array[$id_colum]));
    $product_array[$product_count]['NAME'] = strip_tags(trim($table_cell_array[$name_colum]));
    $product_array[$product_count]['PRICE'] = strip_tags(trim($table_cell_array[$price_colum]));
    $product_count++;
    echo"PROCESSED: Item #$product_count\n";
    }

  #Display the collected data
  for($xx=0; $xx<count($product_array); $xx++)
    {
    echo "$xx. ";
    echo "ID: ".$product_array[$xx]['ID'].", ";
    echo "NAME: ".$product_array[$xx]['NAME'].", ";
    echo "PRICE: ".$product_array[$xx]['PRICE'].", ";
    } 
}
}
}

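Again, the script gives me no errors, but it also produces no output. I'm not sure whether I need to add ?> at the end or not. This is only the second PHP script I've run, so I'm not sure.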

Answer 1

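This doesn't answer your main question, which I think Marc B has already covered well, but since you mentioned it, I'll add that the closing ?> is not required. In fact, when you have many files and one of them has blank lines after the closing tag, it causes "headers already sent" problems.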
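
As a minimal sketch of that point, a file that contains only PHP can simply end after its last statement. The file name and the function below are hypothetical, chosen only for illustration; they are not part of the book's code.

```php
<?php
// settings.php (hypothetical include file, not from the book)
// The file contains only PHP, so the closing tag is left out on purpose:
// without it, stray blank lines at the end of the file cannot be sent
// to the browser before a later header() call.

function store_name(): string
{
    return "Example Store";
}
```

Any script that includes a file written this way can still call header() afterwards, because the include itself produces no output.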

Answer 2

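This code comes from the book Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL by Michael Schrenk.

The code didn't work for me the first time.

After checking the code, I found that the target address has changed.

Replace

$target = "http://www.WebbotsSpidersScreenScrapers.com/example_store";

with

$target = "http://www.webbotsspidersscreenscrapers.com/buyair";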
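
One way to notice a moved target early is to check that the download actually returned something before parsing it. The sketch below is only an illustration: page_is_empty() is a hypothetical helper, and the two arrays are stand-ins for what http_get() returns, assuming only that the page body sits under the 'FILE' key as in the question's code.

```php
<?php
// Hypothetical helper: report whether a download produced any content.
// $web_page mimics the array used in the question, where the page body
// is stored under the 'FILE' key.
function page_is_empty(array $web_page): bool
{
    return !isset($web_page['FILE']) || trim($web_page['FILE']) === '';
}

// Stand-in values instead of a real http_get() call, so the sketch runs offline.
$moved_page = array('FILE' => '');
$good_page  = array('FILE' => '<table><tr><td>Products For Sale</td></tr></table>');

foreach (array('moved' => $moved_page, 'good' => $good_page) as $label => $page) {
    if (page_is_empty($page)) {
        // Nothing to parse: the loops in the question would stay silent here.
        echo "$label: nothing downloaded, check \$target\n";
    } else {
        echo "$label: downloaded " . strlen($page['FILE']) . " bytes\n";
    }
}
```

If the download looks fine and the script is still silent, the parsing lines are worth a second look as well: the "</tables>" delimiter and the $id_colum, $name_colum and $price_colum variables in the question look like typos for "</table>" and the *_column names set earlier in the script.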

PHP script does not run and gives no error message
