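Good day!
I would like help removing a string enclosed in square brackets, including the brackets themselves.
The string looks like this: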
$string = "Lorem ipsum dolor<br /> [ Context are found on www.example.com ] <br />some text here. Text here. [test] Lorem ipsum dolor.";
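I only want to remove the brackets and their contents when they contain "www.example.com". I want to keep "[test]" in the string, along with any other brackets that do not contain "www.example.com".
Thanks!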
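Answer 0 (score: 3)
Note: The OP has changed the question considerably. This solution was designed to handle the question in its original (harder) form, before the "www.example.com" constraint was added. The solution below has been modified to handle that additional constraint, but a simpler solution (i.e. anubhava's answer) may now be sufficient.
Here is my tested solution: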
function strip_bracketed_special($text) {
$re = '% # Remove bracketed text having "www.example.com" within markup.
# Skip comments, CDATA, SCRIPT & STYLE elements, and HTML tags.
( # $1: HTML stuff to be left alone.
<!--.*?--> # HTML comments (non-SGML compliant).
| <!\[CDATA\[.*?\]\]> # CDATA sections
| <script.*?</script> # SCRIPT elements.
| <style.*?</style> # STYLE elements.
| <\w+ # HTML element start tags.
(?: # Group optional attributes.
\s+ # Attributes separated by whitespace.
[\w:.-]+ # Attribute name is required
(?: # Group for optional attribute value.
\s*=\s* # Name and value separated by "="
(?: # Group for value alternatives.
"[^"]*" # Either double quoted string,
| \'[^\']*\' # or single quoted string,
| [\w:.-]+ # or un-quoted string (limited chars).
) # End group of value alternatives.
)? # Attribute values are optional.
)* # Zero or more start tag attributes.
\s*/?> # End of start tag (optional self-close).
| </\w+> # HTML element end tags.
) # End #1: HTML Stuff to be left alone.
| # Or... Bracketed structures containing www.example.com
\s*\[ # (optional ws), Opening bracket.
[^\]]*? # Match up to required content.
www\.example\.com # Required bracketed content.
[^\]]* # Match up to closing bracket.
\]\s* # Closing bracket, (optional ws).
%six';
return preg_replace($re, '$1', $text);
}
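Note that the regex skips over, and therefore never removes bracketed material from inside, HTML comments, CDATA sections, SCRIPT and STYLE elements, and HTML tag attribute values. Given the following XHTML markup (which tests these scenarios), the function above correctly removes only the bracketed content found in HTML element contents: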
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Test special removal. [Remove this www.example.com]</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style type="text/css">
.test.before {
content: "[Do not remove www.example.com]";
}
</style>
<script type="text/javascript">
// <![CDATA[ ["Do not remove www.example.com"] ]]>
var ob = {};
ob["Do not remove www.example.com"] = "stuff";
var str = "[Do not remove www.example.com]";
</script>
</head>
<body>
<!-- <![CDATA[ ["Do not remove www.example.com"] ]]> -->
<div title="[Do not remove www.example.com]">
<h1>Test special removal. [Remove this www.example.com]</h1>
<p>Test special removal. [Remove this www.example.com]</p>
<p onclick='var str = "[Do not remove www.example.com]"; return false;'>
Test special removal. [Do not remove this]
Test special removal. [Remove this www.example.com]
</p>
</div>
</body>
</html>
Here is the same markup after being run through the PHP function above:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Test special removal.</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style type="text/css">
.test.before {
content: "[Do not remove www.example.com]";
}
</style>
<script type="text/javascript">
// <![CDATA[ ["Do not remove www.example.com"] ]]>
var ob = {};
ob["Do not remove www.example.com"] = "stuff";
var str = "[Do not remove www.example.com]";
</script>
</head>
<body>
<!-- <![CDATA[ ["Do not remove www.example.com"] ]]> -->
<div title="[Do not remove www.example.com]">
<h1>Test special removal.</h1>
<p>Test special removal.</p>
<p onclick='var str = "[Do not remove www.example.com]"; return false;'>
Test special removal. [Do not remove this]
Test special removal.</p>
</div>
</body>
</html>
This solution should work quite well for pretty much any valid (X)HTML you can throw at it. (But please, no funky shorttags or SGML comments!)
Answer 1 (score: 1)
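The core idea of the function above (capture what must be left alone in group 1, match what must go in a second alternative, then replace every match with `$1`) can be sketched in a much smaller form. This is only an illustration under simplifying assumptions: the name `strip_bracketed_simple` is hypothetical, the tag matcher is a naive `<[^>]*>` that would break on a `>` inside an attribute value, and unlike the full function it strips only the whitespace before the bracket, not after it.

```php
<?php
// Simplified sketch of the "keep or remove" alternation.
// Alternative 1 (group 1): any HTML tag, which is put back via '$1'.
// Alternative 2: a bracketed span containing www.example.com; group 1 is
// unset for these matches, so '$1' expands to an empty string.
function strip_bracketed_simple(string $text): string {
    $re = '%(<[^>]*>)|\s*\[[^\]]*?www\.example\.com[^\]]*\]%i';
    return preg_replace($re, '$1', $text);
}

$in = '<p title="[keep www.example.com]">Hi [drop www.example.com] there [test]</p>';
echo strip_bracketed_simple($in), "\n";
// prints: <p title="[keep www.example.com]">Hi there [test]</p>
```

Because the tag alternative is tried first and consumes the whole tag, the bracketed text inside the `title` attribute is never seen by the second alternative.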
$str = "Lorem ipsum dolor<br /> [ Context are found on www.example.com ] <br />some text here. Text here. [test] Lorem ipsum dolor.";
$str = preg_replace('~\[[^]]*?www\.example\.com[^]]*\]~si', "", $str);
var_dump($str);
string(83) "Lorem ipsum dolor<br />  <br />some text here. Text here. [test] Lorem ipsum dolor."
PS: It also works when the bracketed text spans multiple lines.
Answer 2 (score: 0)
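As a quick check of the multi-line claim, here is a small sketch using the same pattern on an input where the bracketed text wraps onto a second line. The sample string is made up for illustration. The negated class `[^]]` matches newline characters on its own, so the match crosses the line break.

```php
<?php
// Made-up input: the bracketed span contains a newline.
$str = "Intro [ Context are found\non www.example.com ] middle [test] end.";

// Same pattern as in the answer; [^]] means "any character except ]",
// which includes "\n".
$out = preg_replace('~\[[^]]*?www\.example\.com[^]]*\]~si', '', $str);

var_dump($out);
// string(25) "Intro  middle [test] end."
```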
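Use a regular expression like /\[.*?\]/. The backslashes are necessary; without them, [.*?] would be read as a character class matching any single one of the characters ., *, or ?.
Answer 3 (score: 0)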
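Note that this generic pattern removes every bracketed span, "[test]" included. The sketch below, on a made-up sample string, contrasts it with a variant that additionally requires "www.example.com" inside the brackets; the constrained pattern is an assumption about what the question needs, not part of this answer.

```php
<?php
$string = 'Keep this. [ see www.example.com ] More text. [test] End.';

// Generic pattern from the answer: removes all bracketed spans.
$all = preg_replace('/\[.*?\]/', '', $string);

// Constrained variant: only spans that mention www.example.com.
$some = preg_replace('/\[[^\]]*www\.example\.com[^\]]*\]/', '', $string);

echo $all, "\n";   // Keep this.  More text.  End.
echo $some, "\n";  // Keep this.  More text. [test] End.
```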
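The simplest way I can think of is to use a regular expression that matches everything between [ and ], and then replace it with "". The code below handles the string you used in your example. If the actual strings you need to remove are more complex, you can change the regular expression to match. I recommend using regexpal.com to test your regular expressions.
$string = preg_replace('/\[[A-Za-z .]*\]/', '', $string);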
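If the text that marks a bracket for removal varies, one option is to build the pattern from that text with `preg_quote`, so that characters such as `.` are treated literally. The helper name `remove_brackets_containing` below is hypothetical, and the sketch assumes the marker text itself contains no `]`.

```php
<?php
// Hypothetical helper: removes every [ ... ] span that contains $needle.
function remove_brackets_containing(string $text, string $needle): string {
    // preg_quote escapes regex metacharacters in $needle, plus the "/" delimiter.
    $quoted = preg_quote($needle, '/');
    $pattern = '/\[[^\]]*' . $quoted . '[^\]]*\]/i';
    return preg_replace($pattern, '', $text);
}

$string = "Lorem ipsum dolor<br /> [ Context are found on www.example.com ] <br />some text here. Text here. [test] Lorem ipsum dolor.";
$result = remove_brackets_containing($string, 'www.example.com');
echo $result, "\n";
```

With the question's string, the span mentioning www.example.com is dropped while "[test]" stays in place.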
Answer 4 (score: 0)
The following code removes the bracketed text; the remaining <br /> tags then render as line breaks:
$str = "Lorem ipsum dolor<br />[ Context are found on www.example.com ] <br />some text here";
$str = preg_replace( "/\[[^\]]*\]/m", "", $str);
echo $str;
Output (as rendered in a browser):
Lorem ipsum dolor
some text here