Question

I'm working with the HTML layout below, which appears on hundreds of pages that I want to scrape data from:

The following XPath extracts 'Russell & Bromley', but it also extracts whitespace that I don't want:

//*[@id="stores_list"]/div[2]/div/div[1]/p

How can I use the normalize-space function with the above XPath to remove the whitespace?

Microsoft support documentation:

https://msdn.microsoft.com/en-us/library/ms256063(v=vs.110).aspx

Example string:

normalize-space("abc def")

<div class="store stores_show cms_page_text">
      <div class="row">
        <div class="col col_4 m_col_8 stores_list_address">
          <p class="store_header">
    Russell & Bromley                       
          </p>
    Unit 3A
         <br/>  
    35-38 George Street<br/>
                TW9 1HY                                     
      </div>
      <div class="col col_4 m_col_8 stores_list_contact">
      <strong>T.</strong>         02089486805<br/>                                                          </div>

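I can't get my own XPath to work. Any ideas?

If you need more information, please let me know. I'd like to avoid a second step, such as stripping the whitespace in Excel.

Many thanks. This is completely over my head; I'm a novice with zero experience.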

Answer 1

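You can try this:

Extract the value using XPath.
Save the value in a string.
Use the normalize-space function to remove the leading and trailing whitespace (it also collapses each internal run of whitespace into a single space).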
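
The three steps above could look roughly like this in Python, using only the standard library. This is a sketch under assumptions, not the asker's actual setup: the markup is a trimmed copy of the question's snippet (with `&` escaped and the `<div>` closed so a strict XML parser accepts it), the `store_header` class is used to locate the element instead of the `stores_list` id, and `normalize_space` is a hypothetical helper that mimics the XPath function because `xml.etree.ElementTree` does not evaluate XPath functions itself.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question (an assumption: "&" is
# escaped as "&amp;" and the <div> is closed so it parses as XML).
SNIPPET = """
<div class="col col_4 m_col_8 stores_list_address">
  <p class="store_header">
    Russell &amp; Bromley
  </p>
  Unit 3A
</div>
"""


def normalize_space(text: str) -> str:
    """Hypothetical helper mimicking XPath normalize-space(): collapse each
    run of spaces, tabs, and line breaks to one space, then trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


root = ET.fromstring(SNIPPET.strip())

# Step 1: extract the value with an XPath-style query.
header = root.find(".//p[@class='store_header']")

# Step 2: save the raw value in a string (still padded with whitespace).
raw = header.text or ""

# Step 3: normalize the whitespace.
store_name = normalize_space(raw)
print(repr(store_name))  # prints 'Russell & Bromley'
```

Doing the cleanup in the scraping script itself avoids a separate pass in Excel. If the scraping tool evaluates full XPath 1.0, calling normalize-space inside the expression makes the helper unnecessary.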

Answer 2

Try:

normalize-space(//*[@id="stores_list"]/div[2]/div/div[1]/p)
