I need to take text like this, for example:
Founded in 2008, Stack Overflow sees 40 million visitors each month
|| <b>ID</b> || <b>Column1</b> || <b>Column2</b> ||
| | | |
Stack Overflow Documentation, the largest content expansion since Q&A, launches in July
|| <b>Name</b> || <u>Surname</u> || <u>DoB</u> ||
| | | |
The Developer Story launches in October, giving developers a better way to present their skills
and make it look like this:
Founded in 2008, Stack Overflow sees 40 million visitors each month
<span>|| <b>ID</b> || <b>Column1</b> || <b>Column2</b> ||
| | | |</span>
Stack Overflow Documentation, the largest content expansion since Q&A, launches in July
<span>|| <b>Name</b> || <u>Surname</u> || <u>DoB</u> ||
| | | |
| | | |
</span>
The Developer Story launches in October, giving developers a better way to present their skills
I tried a regex like this:
(
(
(^|\r\n|)+(\|{1,2})
)
(
[\s\S]*
)
(\|{1,2}
($|\r\n|)+
)
)
But that is not what I need: it selects the wrong region, as you can see here: https://regex101.com/r/0h7gVV/2
Another attempt looked like this:
((^|\r\n{2,}|)+(\|{1,2}))(.*)(\|{1,2}(\r\n{2,}|$|)+)
But it ends up selecting every line; you can see the example here: https://regex101.com/r/qpwdwj/2
How should I change the regex to make it work?
UPD
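Wiktor Stribiżew (thanks to him) suggested in the comments that I try his example. It works well on the example above, but not in all possible cases (for example https://regex101.com/r/PvPsxF/3).
The so-called tables look like this: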
|| A | B |
|| c | d |
or like this:
| a | b | c |
| d | e | f |
UPD2
Here is another attempt: https://regex101.com/r/PvPsxF/7, but it includes empty space.
UPD3
This one is close (https://regex101.com/r/PvPsxF/8), but for this test text
Stack Overflow Documentation, the largest content expansion since Q&A, launches in July
|| <b>Name</b> || <u>Surname</u> || <u>DoB</u> ||
| | | |
||
| a | b | c | u |
The Developer Story launches in October, giving developers a better way to present their skills
| a | b | c |
| d | e | f |
the result looks like this:
Stack Overflow Documentation, the largest content expansion since Q&A, launches in July
<span>
|| <b>Name</b> || <u>Surname</u> || <u>DoB</u> ||
| | | |
<!-- not supposed to be wrapped -->
</span><span>||
| a | b | c | u |
</span>The Developer Story launches in October, giving developers a better way to present their skills
<span>
| a | b | c |
| d | e | f |</span>
whereas I do not want a lone || on its own line to be wrapped (in this case it is supposed to be ignored).
P.S.
To clarify, the markup below
|| <b>ID</b> || <b>Column1</b> || <b>Column2</b> ||
| | | |
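is parsed into HTML that looks like a table, where || Cell || represents a header cell and | cell | represents a regular cell.
So, after parsing, it will look like: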
<table>
<tr>
<th>ID</th>
<th>Column1</th>
<th>Column2</th>
</tr>
<tr>
<td> </td>
<td> </td>
<td> </td>
</tr>
</table>
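To make the parsing rule above concrete, here is a minimal Python sketch of how such a block could be turned into a table. It is not the parser used in the question: the helper names `row_to_html` and `table_to_html` are hypothetical, it assumes a row that starts with `||` consists only of header cells, and it keeps any inline markup (such as `<b>`) inside the cells while trimming surrounding whitespace.

```python
import re

def row_to_html(line):
    """Convert one markup row to a <tr> (hypothetical helper)."""
    line = line.strip()
    # Assumption: a row beginning with "||" is a header row.
    tag = "th" if line.startswith("||") else "td"
    # Split on one or two pipes; the first and last pieces are the empty
    # strings before the leading and after the trailing delimiter.
    cells = [c.strip() for c in re.split(r"\|{1,2}", line)[1:-1]]
    return "<tr>" + "".join(f"<{tag}>{c}</{tag}>" for c in cells) + "</tr>"

def table_to_html(block):
    """Convert a block of consecutive markup rows to a <table>."""
    rows = [row_to_html(l) for l in block.splitlines() if l.strip()]
    return "<table>" + "".join(rows) + "</table>"

markup = "|| <b>ID</b> || <b>Column1</b> || <b>Column2</b> ||\n| | | |"
html = table_to_html(markup)
print(html)
```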
Answer 0 (score: 1)
The regex is
(\|\|?([^|\n\r]+\|\|?)+($|[\r\n]+))+
The match is in group $0 (demo).
It works as follows:
(
\|\|? #the line starts with one or two pipes
(
[^|\n\r]+ #followed by at least one non-pipe character
\|\|? #and the cell ends with one or two pipes
)+ #at least one cell, otherwise even the line "||" would be matched
(
$ #the text ends (you are NOT in multiline mode)
|
[\r\n]+ #or [\r\n] characters are matched (at least one, otherwise would match even "||A|B"), in order to match also the possible following line
)
)+ #at least one line
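As an illustration only, the breakdown above can be tried out in Python, assuming the pattern is ported unchanged and compiled with `re.VERBOSE` (not `re.MULTILINE`). The names `TABLE_WITH_NEWLINES` and `sample` are made up for this sketch, and the sample text uses `\n` line endings.

```python
import re

# The pattern from the answer, spread out with re.VERBOSE so that
# whitespace and comments inside the pattern are ignored.
TABLE_WITH_NEWLINES = re.compile(r"""
    (
        \|\|?                  # a row starts with one or two pipes
        ( [^|\n\r]+ \|\|? )+   # one or more cells, each closed by one or two pipes
        ( $ | [\r\n]+ )        # end of the text, or the line break(s) after the row
    )+                         # one or more consecutive rows
""", re.VERBOSE)

sample = (
    "Founded in 2008, Stack Overflow sees 40 million visitors each month\n"
    "|| <b>ID</b> || <b>Column1</b> || <b>Column2</b> ||\n"
    "| | | |\n"
    "Stack Overflow Documentation, the largest content expansion since Q&A, launches in July\n"
)

# Each match is a whole "table", including the line break that follows it.
matches = [m.group(0) for m in TABLE_WITH_NEWLINES.finditer(sample)]
print(matches)
```

Note that the match ends with the newline after the last row, which is the trailing whitespace the question complains about.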
If you don't want to match the whitespace/newlines after the "table", just use a slightly more complex regex (demo):
\|\|?([^|\n\r]+\|\|?)+$([\r\n]+\|\|?([^|\n\r]+\|\|?)+$)*
With this last regex, remember to use the m flag.
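For what it is worth, here is a rough Python sketch of the wrapping step using this last regex, with `re.MULTILINE` standing in for the m flag. The function name `wrap_tables` is invented for the example, and it assumes `\n` line endings, because Python's `$` in multiline mode only matches before `\n`.

```python
import re

# Last regex from the answer; re.MULTILINE lets $ match at every line end.
TABLE_RE = re.compile(
    r"\|\|?([^|\n\r]+\|\|?)+$([\r\n]+\|\|?([^|\n\r]+\|\|?)+$)*",
    re.MULTILINE,
)

def wrap_tables(text):
    """Wrap every run of consecutive table rows in <span>...</span>."""
    return TABLE_RE.sub(lambda m: "<span>" + m.group(0) + "</span>", text)

sample = (
    "Stack Overflow Documentation, the largest content expansion since Q&A, launches in July\n"
    "|| <b>Name</b> || <u>Surname</u> || <u>DoB</u> ||\n"
    "| | | |\n"
    "||\n"
    "| a | b | c | u |\n"
    "The Developer Story launches in October, giving developers a better way to present their skills\n"
    "| a | b | c |\n"
    "| d | e | f |\n"
)

wrapped = wrap_tables(sample)
print(wrapped)
```

On the test text from UPD3, the lone `||` line stays outside any `<span>`, because each cell must contain at least one non-pipe character.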