Need help accessing PHP DOM elements

Time: 2010-05-16 08:20:45

Tags: php dom xpath

Hey guys, I have the following HTML structure that I'm trying to pull information from:

// Product 1
<div class="productName">
 <span id="product-name-1">Product Name 1</span>
</div>

<div class="productDetail">            
 <span class="warehouse">Warehouse 1, ACT</span>                
 <span class="quantityInStock">25</span>
</div>

// Product 2
<div class="productName">
 <span id="product-name-2">Product Name 2</span>
</div>

<div class="productDetail">            
 <span class="warehouse">Warehouse 2, ACT</span>                
 <span class="quantityInStock">25</span>
</div>

…

// Product X
<div class="productName">
 <span id="product-name-X">Product Name X</span>
</div>

<div class="productDetail">            
 <span class="warehouse">Warehouse X, ACT</span>                
 <span class="quantityInStock">25</span>
</div>

I have no control over the source HTML and, as you can see, a productName and its accompanying productDetail are not wrapped in a common element.

Right now I'm using the following PHP code to try to parse the page.

$html = new DOMDocument();
$html->loadHtmlFile('product_test.html');

$xPath = new DOMXPath($html);

$domQuery = '//div[@class="productName"]|//div[@class="productDetail"]';

$entries = $xPath->query($domQuery);

foreach ($entries as $entry) { 
 echo "Detail: " . $entry->nodeValue . "<br />\n";
}

This prints the following:

Detail: Product Name 1
Detail: Warehouse 1, ACT
Detail: 25
Detail: Product Name 2
Detail: Warehouse 2, ACT
Detail: 25
Detail: Product Name X
Detail: Warehouse X, ACT
Detail: 25

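Now, this is close to what I want. But I need to do some processing on each product, warehouse, and stock quantity, and I can't figure out how to parse the results into separate product groups. The end result I'm after is: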

Product 1:
Name: Product Name 1
Warehouse: Warehouse 1, ACT
Stock: 25

Product 2:
Name: Product Name 2
Warehouse: Warehouse 2, ACT
Stock: 25 

I just can't figure this out, and I can't wrap my head around this DOM stuff, because the elements don't work the same way as a standard array.

If anyone can help, or point me in the right direction, I will be forever grateful.

1 answer:

Answer 0 (score: 0)

Maybe not the most efficient way, but

$html = new DOMDocument();
$html->loadHtmlFile('test2.php');

$xPath = new DOMXPath($html);

foreach( $xPath->query('//div[@class="productName"]') as $prodName ) { 
  $prodDetail = $xPath->query('following-sibling::div[@class="productDetail"][1]', $prodName);
  // Skip this product name if no productDetail sibling follows it.
  if ($prodDetail->length === 0) {
    continue;
  }
  $prodDetail = $prodDetail->item(0);
  echo "Name: " . $prodName->nodeValue . "<br />\n";
  echo "Detail: " . $prodDetail->nodeValue . "<br />\n";
  echo "----\n";
}

prints

Name: 
 Product Name 1
<br />
Detail:             
 Warehouse 1, ACT                
 25
<br />
----
Name: 
 Product Name 2
<br />
Detail:             
 Warehouse 2, ACT                
 25
<br />
----
Name: 
 Product Name X
<br />
Detail:             
 Warehouse X, ACT                
 25
<br />
----
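
To get from that output to the per-product groups the question asks for, the same following-sibling lookup can be combined with relative XPath expressions for the individual spans. What follows is only a minimal sketch, not a tested drop-in: the markup is an inline copy of the first two products from the question (so the snippet runs without product_test.html), and the $markup, $products and array key names are made up for illustration. normalize-space() is a standard XPath 1.0 function, used here to strip the surrounding whitespace seen in the output above.

```php
<?php
// Inline copy of the question's markup (assumption: the real page has the same shape).
$markup = <<<'HTML'
<html><body>
<div class="productName">
 <span id="product-name-1">Product Name 1</span>
</div>
<div class="productDetail">
 <span class="warehouse">Warehouse 1, ACT</span>
 <span class="quantityInStock">25</span>
</div>
<div class="productName">
 <span id="product-name-2">Product Name 2</span>
</div>
<div class="productDetail">
 <span class="warehouse">Warehouse 2, ACT</span>
 <span class="quantityInStock">25</span>
</div>
</body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xPath = new DOMXPath($doc);

$products = array();
foreach ($xPath->query('//div[@class="productName"]') as $prodName) {
    // Nearest productDetail div that follows this productName at the same level.
    $detail = $xPath->query('following-sibling::div[@class="productDetail"][1]', $prodName)->item(0);
    if ($detail === null) {
        continue; // no detail block follows this name
    }

    // evaluate() returns a plain string for string-valued expressions.
    $products[] = array(
        'name'      => $xPath->evaluate('normalize-space(.)', $prodName),
        'warehouse' => $xPath->evaluate('normalize-space(span[@class="warehouse"])', $detail),
        'stock'     => (int) $xPath->evaluate('normalize-space(span[@class="quantityInStock"])', $detail),
    );
}

// $products is now an ordinary PHP array, one entry per product.
foreach ($products as $i => $product) {
    echo "Product " . ($i + 1) . ":\n";
    echo "Name: " . $product['name'] . "\n";
    echo "Warehouse: " . $product['warehouse'] . "\n";
    echo "Stock: " . $product['stock'] . "\n\n";
}
```

One thing to watch: the [1] predicate picks the nearest following productDetail, so a productName that has no detail block of its own would be paired with the next product's detail. If the real page can contain such gaps, an extra check would be needed.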