Question

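I want to use a regular expression to get the numbers that appear between tags in HTML content, skipping anything that is a hexadecimal value or a Unicode entity. For example, given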

<a href="/sam2/example-3.php">go to page 13</a> 0x91 0x26 exchange hello98.25 &#8230;

it should return

13 and 98.25

Answer 1

I can't see what use this would have... If you want to build pagination, you should pass the page number as GET or POST data, for example:

 <a href="/sam2/example-3.php?page=13">go to page 13</a>

You can then retrieve the page value and use it in your script:

 $page = $_GET['page'];
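
Reading `$_GET['page']` directly raises a warning when the parameter is missing and accepts any string. A minimal sketch of a stricter read, assuming a hypothetical helper named `page_from_query` (not part of the original answer) and a fallback to page 1:

```php
<?php
// Hypothetical helper: returns a positive page number from a query array
// such as $_GET, falling back to 1 when the value is missing or invalid.
function page_from_query(array $query): int
{
    $raw = $query['page'] ?? '';

    // FILTER_VALIDATE_INT rejects non-integer input; 'default' is returned
    // instead of false when validation fails.
    return filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['default' => 1, 'min_range' => 1],
    ]);
}

$page = page_from_query($_GET);
```

Passing the array in as an argument, rather than reading `$_GET` inside the function, is only a choice made here so the helper can be exercised without a web request.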

But anyway, to answer your question:

$content = '<a href="/sam2/example-3.php">go to page 13</a> 0x91 0x26 exchange hello98.25 &#8230;';

$page_id = preg_replace('/(\"(.*)\"|0x.[0-9]+|\&\#.[0-9]+|[^0-9\.])/', ' ', $content);

echo $page_id;

//Result: 13 98.25 (a string in which the numbers are separated by runs of spaces, one space for each removed match)
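
If an array is more convenient than a space-padded string, the cleaned string can be split on whitespace. This is a small sketch built on the same pattern; the variable names `$cleaned` and `$numbers` are introduced here for illustration:

```php
<?php
$content = '<a href="/sam2/example-3.php">go to page 13</a> 0x91 0x26 exchange hello98.25 &#8230;';

// Same replacement as above: quoted attribute values, 0x... values,
// &#... entities, and every other non-numeric character become spaces.
$cleaned = preg_replace('/(\"(.*)\"|0x.[0-9]+|\&\#.[0-9]+|[^0-9\.])/', ' ', $content);

// Split on runs of whitespace and drop the empty pieces at both ends.
$numbers = preg_split('/\s+/', $cleaned, -1, PREG_SPLIT_NO_EMPTY);

print_r($numbers); // ['13', '98.25']
```

Note that `"(.*)"` is greedy, so with more than one quoted attribute in the input it would swallow everything between the first and the last quote, including any numbers in between.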

Good luck.

Answer 2

I finally wrote my own regular expression:

'/(?:&#\d{2,4};)|(?:0[xX][0-9a-fA-F]+)|(\d+[\.\d]*)|<\s*[^>]+>/i'

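It works perfectly.

It picks up only the numbers between tags (they are the only part that lands in capture group 1).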
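
The answer does not show how the pattern is applied, so here is one possible way to use it, assuming `preg_match_all` and the sample string from the question. Entities, hex values, and tags are matched by the non-capturing alternatives, so their slot in group 1 is an empty string and has to be filtered out:

```php
<?php
$content = '<a href="/sam2/example-3.php">go to page 13</a> 0x91 0x26 exchange hello98.25 &#8230;';

$pattern = '/(?:&#\d{2,4};)|(?:0[xX][0-9a-fA-F]+)|(\d+[\.\d]*)|<\s*[^>]+>/i';

// $matches[1] holds capture group 1 for every match; it is '' whenever
// the match came from the entity, hex, or tag alternative.
preg_match_all($pattern, $content, $matches);

// Keep only the non-empty captures and reindex from 0.
$numbers = array_values(array_filter($matches[1], static function ($s) {
    return $s !== '';
}));

print_r($numbers); // ['13', '98.25']
```

One caveat: `\d+[\.\d]*` also accepts strings such as `1.2.3` or `5.`, so a stricter pattern like `\d+(?:\.\d+)?` may be preferable if the input can contain those.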
