I have a table of names and URLs, like this:
<tr>
<td>name1</td>
<td>www.url.com</td> </tr>
<tr>
<td>name2</td>
<td>www.url2.com</td> </tr>
I want to select every table cell that contains a URL. I tried:
<td>w{3,3}.*(</td>){1,1}
However, this expression does not "stop" at the first </td>. Instead it matches:
<td>www.url.com</td> </tr>
<tr>
<td>name2</td>
<td>www.url2.com</td>
as the result. Where is my mistake?
Answer 0: (score: 1)
There are several ways to match a URL. I'll stick to what you asked for: correcting your regex. You can use this instead:
<td>w{3}.*?</td>
Explanation:
<td>          # this part is fine
w{3,3}        # {3} is simpler here and has the same effect
.*            # the main problem: use .*? to make .* non-greedy,
              # that is, to make it match as little as possible
(</td>){1,1}  # same as the second line; since the count is 1, {1} is not needed
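
To see the difference between the greedy and the non-greedy version, here is a minimal Python sketch. It assumes the engine lets . match newlines (re.DOTALL in Python), which the multi-line result in the question suggests. The variable names are only illustrative, and the sample input is the table from the question.

```python
import re

# Sample rows copied from the question.
html = """<tr>
<td>name1</td>
<td>www.url.com</td> </tr>
<tr>
<td>name2</td>
<td>www.url2.com</td> </tr>"""

# Assumption: "." also matches newlines, as in the asker's setup.
greedy = re.compile(r"<td>w{3}.*</td>", re.DOTALL)
lazy = re.compile(r"<td>w{3}.*?</td>", re.DOTALL)

# Greedy .* runs to the end of the input and backtracks to the LAST </td>.
greedy_matches = greedy.findall(html)

# Lazy .*? expands one character at a time and stops at the FIRST </td>.
lazy_matches = lazy.findall(html)

print(len(greedy_matches))
print(lazy_matches)
```

The greedy pattern returns one long match spanning both rows, while the non-greedy one returns each URL cell separately.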
Answer 1: (score: 0)
Your regex could be
\b(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]
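or (this one appears to be written as a Java string literal, with doubled backslashes and && class intersection)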
"((((ht{2}ps?://)?)((w{3}\\.)?))?)[^.&&[a-zA-Z0-9]][a-zA-Z0-9.-]+[^.&&[a-zA-Z0-9]](\\.[a-zA-Z]{2,3})"
See this link: What is the best regular expression to check if a string is a valid URL? There are many answers there.
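
One thing worth checking before adopting the first pattern: it requires a scheme such as http://, so it would not match the bare www.url.com cells from the question. A rough Python sketch of that check follows; find_url is a hypothetical helper name, and the pattern itself is copied unchanged from above.

```python
import re

# First pattern from this answer, copied unchanged.
URL_WITH_SCHEME = re.compile(
    r"\b(https?|ftp|file)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]*[-A-Za-z0-9+&@#/%=~_|]"
)

def find_url(cell):
    """Return the first scheme-prefixed URL found in cell, or None."""
    match = URL_WITH_SCHEME.search(cell)
    return match.group(0) if match else None

# A cell as it appears in the question: no scheme, so nothing is found.
bare = find_url("<td>www.url.com</td>")

# The same cell with a scheme added: the URL is extracted without the tags.
with_scheme = find_url("<td>http://www.url.com</td>")

print(bare)
print(with_scheme)
```

If the table only ever contains scheme-less hosts, the corrected pattern from the other answer is probably the closer fit.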