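I am trying to move parentheses from inside a tag to outside it, that is, when an opening parenthesis immediately follows the opening tag, or a closing parenthesis immediately precedes the closing tag. For example: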
<italic>(When a parenthetical sentence stands on its own)</italic>
<italic>(When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own)</italic>
After the replacement, these lines should become:
(<italic>When a parenthetical sentence stands on its own</italic>)
(<italic>When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own</italic>)
However, the following three strings should remain unchanged:
<italic>(When) a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its (own)</italic>
<italic>When a parenthetical sentence stands (on) its own</italic>
But the following strings:
<italic>((When) a parenthetical sentence stands on its own</italic>
<italic>((When) a parenthetical sentence stands on its own)</italic>
<italic>(When) a parenthetical sentence stands on its own)</italic>
<italic>When a parenthetical sentence stands on its (own))</italic>
<italic>(When a parenthetical sentence stands on its (own)</italic>
should become, after the replacement:
(<italic>(When) a parenthetical sentence stands on its own</italic>
(<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>When a parenthetical sentence stands on its (own)</italic>)
(<italic>When a parenthetical sentence stands on its (own)</italic>
There may be nested tags inside an <italic>...</italic> element, and a single line can contain more than one <italic>...</italic> string.
Also, any <italic>...</italic> tags nested inside <inline-formula>...</inline-formula> should be ignored.
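Can I do this with a regular expression? If not, is there another way to do it?
This is my approach (I am still not sure whether it covers every possible case):
Step 1: <italic>( ---> (<italic>
Find <italic>( if the tag is not followed by a balanced pair of parentheses that is in turn followed by the closing tag.
Matching is allowed only within a single line.
Find what: (<(italic)>)(?!(\((?>(?:(?![()\r\n]).)++|(?3))*+\))(?!</$2\b))(\()
Replace with: $4$1
Step 2: )</italic> ---> </italic>)
Find )</italic> if the tag is not preceded by a balanced pair of parentheses that is itself not preceded by the opening tag.
Matching is allowed only within a single line.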
(\))(?<!(?<!<(italic)>)(\((?>(?:(?![()\r\n]).)++|(?3))*+\)))(</2\b>)
Answer 0 (score: 1)
You could do this in a few different ways. I would start by working out when the parentheses of a tag can be moved.
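To make that first step concrete, here is a minimal sketch of the decision on its own. It is separate from the full program further down, and the names `ParenRule` and `Classify` are made up for illustration. It looks only at the text between the opening and closing tag, pairs up the parentheses with a stack, and reports whether the leading `(` and the trailing `)` could be moved outside. It assumes the text contains no nested tags; any markup inside would simply be treated as ordinary characters.

```csharp
using System;
using System.Collections.Generic;

public static class ParenRule
{
    // Hypothetical helper: decides, for the text between <italic> and </italic>,
    // whether the first '(' and/or the last ')' may be moved outside the tag.
    public static (bool MoveLeading, bool MoveTrailing) Classify(string inner)
    {
        if (string.IsNullOrEmpty(inner)) return (false, false);

        int last = inner.Length - 1;
        int matchOfFirst = -1; // index of the ')' that closes the '(' at index 0
        int matchOfLast = -1;  // index of the '(' closed by the ')' at the last index
        var open = new Stack<int>(); // positions of '(' that are still unclosed

        for (int i = 0; i < inner.Length; i++)
        {
            if (inner[i] == '(')
            {
                open.Push(i);
            }
            else if (inner[i] == ')' && open.Count > 0)
            {
                int start = open.Pop();
                if (start == 0) matchOfFirst = i;
                if (i == last) matchOfLast = start;
            }
        }

        // A leading '(' moves out if it is never closed, or is closed by the final ')'.
        bool moveLeading = inner[0] == '('
            && (matchOfFirst == -1 || matchOfFirst == last);
        // A trailing ')' moves out if it has no partner, or its partner is the first '('.
        bool moveTrailing = inner[last] == ')'
            && (matchOfLast == -1 || matchOfLast == 0);
        return (moveLeading, moveTrailing);
    }
}
```

For example, `ParenRule.Classify("(When) a sentence)")` returns MoveLeading = false and MoveTrailing = true, because the first `(` is closed by the `)` after `When` while the final `)` has no partner. Since every parenthesis is paired explicitly, a string such as `(a) b (c)` is left alone as well.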
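This problem seems well suited to a parser that keeps track of the parenthesis state (whether the tag text starts with a parenthesis, and how deeply the parentheses are nested at the current point). Writing a parser lets us build the replacement constructively instead of searching with a regex and replacing substrings, and it handles nesting naturally through recursion. Doing this with a regular expression seems a bit complicated. This is what I came up with.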
using System;
using System.IO;
using System.Text;
namespace ParenParser {
    public class Program
    {
        // Wraps a string in an in-memory stream so it can be read with a StreamReader.
        public static Stream GenerateStreamFromString(string s)
        {
            MemoryStream stream = new MemoryStream();
            StreamWriter writer = new StreamWriter(stream);
            writer.Write(s);
            writer.Flush();
            stream.Position = 0;
            return stream;
        }
        // Root: copies plain text through and hands every tag to ProcessTag.
        public static String Process(StreamReader s) {
            StringBuilder output = new StringBuilder();
            while (!s.EndOfStream) {
                var ch = Convert.ToChar(s.Read());
                if (ch == '<') {
                    output.Append(ProcessTag(s, true));
                } else {
                    output.Append(ch);
                }
            }
            return output.ToString();
        }
        // Reads one element (opening tag, text, closing tag) and returns it with the
        // leading/trailing parentheses moved outside where appropriate.
        // skipOpeningBracket: true when the caller has already consumed the '<'.
        public static String ProcessTag(StreamReader s, bool skipOpeningBracket = true) {
            int currentParenDepth = 0; // '(' minus ')' seen directly in this tag's text
            StringBuilder openingTag = new StringBuilder(), allTagText = new StringBuilder(), closingTag = new StringBuilder();
            bool inOpeningTag = false, inClosingTag = false;
            if (skipOpeningBracket) {
                inOpeningTag = true;
                openingTag.Append('<');
            }
            while (!s.EndOfStream) {
                var ch = Convert.ToChar(s.Read());
                if (ch == '<') { // start of a tag
                    // Peek returns -1 at end of stream, so keep it as an int
                    // instead of converting to char (which would throw).
                    int nextCh = s.Peek();
                    if (nextCh == '/') { // closing tag!
                        closingTag.Append(ch);
                        inClosingTag = true;
                    } else if (openingTag.ToString().Length != 0) { // already seen a tag, recurse
                        allTagText.Append(ProcessTag(s, true));
                        continue;
                    } else {
                        openingTag.Append(ch);
                        inOpeningTag = true;
                    }
                }
                else if (inOpeningTag) {
                    openingTag.Append(ch);
                    if (ch == '>') {
                        inOpeningTag = false;
                    }
                }
                else if (inClosingTag) {
                    closingTag.Append(ch);
                    if (ch == '>') {
                        // Done! Decide which parentheses to move based on the depth.
                        var allTagTextString = allTagText.ToString();
                        if (allTagTextString.Length > 0 && allTagTextString[0] == '(' && allTagTextString[allTagTextString.Length - 1] == ')' && currentParenDepth == 0) {
                            // balanced: move both
                            return "(" + openingTag.ToString() + allTagTextString.Substring(1, allTagTextString.Length - 2) + closingTag.ToString() + ")";
                        } else if (allTagTextString.Length > 0 && allTagTextString[0] == '(' && currentParenDepth > 0) { // unclosed
                            return "(" + openingTag.ToString() + allTagTextString.Substring(1, allTagTextString.Length - 1) + closingTag.ToString();
                        } else if (allTagTextString.Length > 0 && allTagTextString[allTagTextString.Length - 1] == ')' && currentParenDepth < 0) { // unopened
                            return openingTag.ToString() + allTagTextString.Substring(0, allTagTextString.Length - 1) + closingTag.ToString() + ")";
                        } else {
                            return openingTag.ToString() + allTagTextString + closingTag.ToString();
                        }
                    }
                }
                else
                {
                    // Ordinary text inside the tag: record it and track the paren depth.
                    allTagText.Append(ch);
                    if (ch == '(') {
                        currentParenDepth++;
                    }
                    else if (ch == ')') {
                        currentParenDepth--;
                    }
                }
            }
            // Stream ended before the closing tag was complete: return what was read.
            return openingTag.ToString() + allTagText.ToString() + closingTag.ToString();
        }
        public static void Main()
        {
            var testCases = new String[] {
                // Should change
                "<italic>(When a parenthetical sentence stands on its own)</italic>",
                "<italic>(When a parenthetical sentence stands on its own</italic>",
                "<italic>When a parenthetical sentence stands on its own)</italic>",
                // Should remain unchanged
                "<italic>(When) a parenthetical sentence stands on its own</italic>",
                "<italic>When a parenthetical sentence stands on its (own)</italic>",
                "<italic>When a parenthetical sentence stands (on) its own</italic>",
                // Should be changed
                "<italic>((When) a parenthetical sentence stands on its own</italic>",
                "<italic>((When) a parenthetical sentence stands on its own)</italic>",
                "<italic>(When) a parenthetical sentence stands on its own)</italic>",
                "<italic>When a parenthetical sentence stands on its (own))</italic>",
                "<italic>(When a parenthetical sentence stands on its (own)</italic>",
                // Other cases
                "<italic>(Try This on!)</italic>",
                "<italic><italic>(Try This on!)</italic></italic>",
                "<italic></italic>",
                "",
                "()",
                "<italic>()</italic>",
                "<italic>"
            };
            foreach(var testCase in testCases) {
                using(var testCaseStreamReader = new StreamReader(GenerateStreamFromString(testCase))) {
                    Console.WriteLine(testCase + " --> " + Process(testCaseStreamReader));
                }
            }
        }
    }
}
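The test cases produce output like this:
<italic>(When a parenthetical sentence stands on its own)</italic> --> (<italic>When a parenthetical sentence stands on its own</italic>)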
<italic>(When a parenthetical sentence stands on its own</italic> --> (<italic>When a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its own)</italic> --> <italic>When a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own</italic> --> <italic>(When) a parenthetical sentence stands on its own</italic>
<italic>When a parenthetical sentence stands on its (own)</italic> --> <italic>When a parenthetical sentence stands on its (own)</italic>
<italic>When a parenthetical sentence stands (on) its own</italic> --> <italic>When a parenthetical sentence stands (on) its own</italic>
<italic>((When) a parenthetical sentence stands on its own</italic> --> (<italic>(When) a parenthetical sentence stands on its own</italic>
<italic>((When) a parenthetical sentence stands on its own)</italic> --> (<italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>(When) a parenthetical sentence stands on its own)</italic> --> <italic>(When) a parenthetical sentence stands on its own</italic>)
<italic>When a parenthetical sentence stands on its (own))</italic> --> <italic>When a parenthetical sentence stands on its (own)</italic>)
<italic>(When a parenthetical sentence stands on its (own)</italic> --> (<italic>When a parenthetical sentence stands on its (own)</italic>
<italic>(Try This on!)</italic> --> (<italic>Try This on!</italic>)
<italic><italic>(Try This on!)</italic></italic> --> (<italic><italic>Try This on!</italic></italic>)
<italic></italic> --> <italic></italic>
-->
() --> ()
<italic>()</italic> --> (<italic></italic>)
<italic> --> <italic>
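The question also asks that anything inside <inline-formula>...</inline-formula> be left alone, which the parser above does not do by itself. One possible way to layer that on top is sketched below. This is only a rough sketch under some assumptions: the names `FormulaGuard` and `ApplyOutsideFormulas` are hypothetical, the `<inline-formula>` tag is assumed to carry no attributes and not to be nested, and a formula is assumed not to sit inside an italic element whose parentheses need moving (splitting the line at the formula would hide the matching tag from the parser). The idea is to cut the line into pieces at the formula boundaries, copy the formulas through untouched, and run the transformation only on the pieces in between.

```csharp
using System;
using System.Text;

public static class FormulaGuard
{
    const string Open = "<inline-formula>";   // assumed: no attributes on the tag
    const string Close = "</inline-formula>";

    // Applies `transform` to every stretch of text outside an inline formula
    // and copies each formula (tags included) to the output unchanged.
    public static string ApplyOutsideFormulas(string line, Func<string, string> transform)
    {
        var output = new StringBuilder();
        int pos = 0;
        while (pos < line.Length)
        {
            int start = line.IndexOf(Open, pos, StringComparison.Ordinal);
            if (start < 0) break; // no more formulas on this line

            // Text before the formula is ordinary text.
            output.Append(transform(line.Substring(pos, start - pos)));

            int end = line.IndexOf(Close, start + Open.Length, StringComparison.Ordinal);
            if (end < 0)
            {
                // Unterminated formula: copy the remainder untouched.
                output.Append(line, start, line.Length - start);
                return output.ToString();
            }
            end += Close.Length;
            output.Append(line, start, end - start); // the formula itself, verbatim
            pos = end;
        }
        // Whatever follows the last formula (possibly nothing).
        output.Append(transform(line.Substring(pos)));
        return output.ToString();
    }
}
```

The `transform` argument could be a small wrapper that feeds each piece to `Process` from the program above. Keeping the formula handling in a separate pass means the parser itself stays unchanged, at the cost of the limitation mentioned for formulas inside an italic element.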