Question

I am trying to remove ".html" and everything after it from a web address string. My current (failing) code is:

$input = 'http://example.com/somepage.html?foo=bar&baz=x';
$result = preg_replace("/(.html)[^.html]+$/i",'',$input);

期望的结果：

value of $result is 'http://example.com/somepage'

Some other examples of $input that should produce the same value of $result:

http://example.com/somepage
http://example.com/somepage.html
http://example.com/somepage.html?url=http://example.com/index.html

Answer 1

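Your regular expression is wrong: it only matches strings that end with <any one char>, then "html", then <one or more chars other than ., h, t, m or l>. Inside a character class, ^ negates the set, so [^.html] does not mean "anything except the string .html". Since preg_replace simply returns the string as-is when nothing matches, you can match the literal .html and discard everything after it: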

$result = preg_replace('/\.html.*/', '', $input);
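As a quick check, the sketch below runs that pattern over the sample inputs from the question. The $inputs and $results arrays are names introduced here for illustration; every entry should reduce to the same base address.

```php
<?php
// Sample inputs taken from the question.
$inputs = [
    'http://example.com/somepage',
    'http://example.com/somepage.html',
    'http://example.com/somepage.html?foo=bar&baz=x',
    'http://example.com/somepage.html?url=http://example.com/index.html',
];

$results = [];
foreach ($inputs as $input) {
    // Remove the first literal ".html" and everything after it.
    // An input without ".html" is returned unchanged.
    $results[] = preg_replace('/\.html.*/', '', $input);
}

// Collapse duplicates to see how many distinct results remain.
print_r(array_unique($results));
```

The pattern is case-sensitive as written; adding the i modifier ('/\.html.*/i') would also catch ".HTML", if that matters for your data.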

Answer 2

Why not use parse_url?
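The answer gives no code, so here is one possible sketch of what a parse_url approach might look like. The helper name stripHtmlAndQuery is hypothetical, and the sketch assumes absolute URLs with no port or credentials, since only the scheme, host and path are reassembled.

```php
<?php
// Hypothetical helper (not from the answer): rebuild the URL from its
// scheme, host and path, dropping the query string and fragment, then
// remove a trailing ".html" from the path.
function stripHtmlAndQuery(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // Malformed or relative URL: leave it unchanged.
    }

    $path = $parts['path'] ?? '';
    // Strip ".html" only at the very end of the path, case-insensitively.
    $path = preg_replace('/\.html$/i', '', $path);

    return $parts['scheme'] . '://' . $parts['host'] . $path;
}

echo stripHtmlAndQuery('http://example.com/somepage.html?foo=bar&baz=x'), "\n";
```

Unlike the plain regex, this only touches ".html" at the end of the path, so a ".html" that appears inside the query string or in the middle of the path is never treated as the cut point.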

Answer 3

If you are having trouble with the preg_replace() syntax, you can also use explode():

$input = explode(".html", $input);
$result = $input[0];
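A small variation on the same idea, offered as a sketch: passing a limit of 2 makes explode() split only at the first ".html", and using a separate $parts variable (a name introduced here) keeps $input intact.

```php
<?php
$input = 'http://example.com/somepage.html?url=http://example.com/index.html';

// Limit of 2: split at the first ".html" only; the rest stays in $parts[1].
$parts = explode('.html', $input, 2);

// If ".html" is absent, $parts[0] is simply the whole input string.
$result = $parts[0];

echo $result, "\n";
```

Note that explode() is case-sensitive, so an address ending in ".HTML" would pass through unchanged.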
