I have this HTML code:
<div class="keyboard">
<p>
Hello world!
</p>
</div>
I want to extract the text "Hello world!". I tried the regex code below, but it does not work.
Dim findtext2 As String = "(?<=<div class=""keyboard"">)(.*?)(?=</div>)"
Dim myregex2 As String = TextBox1.Text 'HTML code above
Dim doregex2 As MatchCollection = Regex.Matches(myregex2, findtext2)
Dim matches2 As String = ""
For Each match2 As Match In doregex2
matches2 = matches2 + match2.ToString + Environment.NewLine
Next
MsgBox(matches2)
Answer 0 (score: 0)
As mentioned in the comments, do not use regex to parse HTML code. Use LINQ to XML instead:
Dim html As XElement =
<html>
<body>
<div class = "keyboard">
<p>Hello world!</p>
</div>
</body>
</html>
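' The query yields one string per matching <div>, so the result is a sequence, not a single String.
Dim values As IEnumerable(Of String) =
    html.Descendants("div").
    Where(Function(div) div.Attribute("class")?.Value = "keyboard").
    Select(Function(div) div.Element("p").Value)
' ?. skips any <div> that has no class attribute instead of throwing.
For Each value As String In values
    Console.WriteLine(value)
Next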
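The question reads the markup from a string (TextBox1.Text) rather than from an XML literal, so here is a minimal sketch of the same LINQ to XML approach applied to a string. The module name KeyboardTextDemo and the helper ExtractKeyboardText are hypothetical names chosen for illustration. It assumes the markup is well-formed XML; XElement.Parse throws an XmlException otherwise, and real-world HTML often is not well-formed, in which case a dedicated HTML parser would be a better fit.

```vbnet
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module KeyboardTextDemo
    ' Hypothetical helper: returns the trimmed text of every <p> that sits
    ' directly inside a <div class="keyboard">.
    ' Assumption: the markup is well-formed XML.
    Function ExtractKeyboardText(markup As String) As String()
        Dim root As XElement = XElement.Parse(markup)
        ' DescendantsAndSelf is used because the <div> may be the root element itself.
        Return root.DescendantsAndSelf("div").
            Where(Function(d) d.Attribute("class")?.Value = "keyboard").
            Elements("p").
            Select(Function(p) p.Value.Trim()).
            ToArray()
    End Function

    Sub Main()
        ' Same markup as in the question, including the line breaks around the text.
        Dim markup As String = String.Join(Environment.NewLine,
            "<div class=""keyboard"">",
            "<p>",
            "Hello world!",
            "</p>",
            "</div>")
        For Each line As String In ExtractKeyboardText(markup)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

In a WinForms handler like the one in the question, the call would look like ExtractKeyboardText(TextBox1.Text). The Trim call removes the line breaks that surround the text inside the <p> element.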