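I'm building a small "home" application using VB. As the title says, I want to grab part of the text from a local HTML file and use it as a variable, or put it in a text box.
I tried something like this...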
    Private Sub Open_Button_Click(sender As Object, e As EventArgs) Handles Open_Button.Click
        Dim openFileDialog As New OpenFileDialog()
        openFileDialog.CheckFileExists = True
        openFileDialog.CheckPathExists = True
        openFileDialog.FileName = ""
        openFileDialog.Filter = "All|*.*"
        openFileDialog.Multiselect = False
        openFileDialog.Title = "Open"
        If openFileDialog.ShowDialog() = Windows.Forms.DialogResult.OK Then
            ' Read from the dialog declared above (the original referenced openFileDialog1)
            Dim fileReader As String = My.Computer.FileSystem.ReadAllText(openFileDialog.FileName)
            TextBox.Text = fileReader
        End If
    End Sub
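The result is that the entire HTML source gets loaded into the text box. What should I do to get only a specific part of the HTML file? Let's say I only want to grab the text from this span... <span id="something">This is a text!!!</span>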
Answer 0: (score: 1)
I've made the following assumptions for this answer.
I would do something like this:
    ' Get the HTML document
    Dim fileReader As String = My.Computer.FileSystem.ReadAllText(openFileDialog.FileName)
    ' Split the HTML text on the opening span tag
    Dim fileSplit As String() = fileReader.Split(New String() {"<span id=""something"">"}, StringSplitOptions.None)
    ' Keep the last part of the text
    fileReader = fileSplit.Last()
    ' Now trim everything after the closing tag
    fileSplit = fileReader.Split(New String() {"</span>"}, StringSplitOptions.None)
    ' Keep the first part of the text
    fileReader = fileSplit.First()
    ' fileReader should now contain the contents of the span tag with id "something"
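Note: this code is untested. I typed it in the Stack Exchange mobile app, so there may be some autocorrect errors.
You may want to add some error checking, for example making sure the span element appears only once, and so on.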
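As a rough illustration of the error checking mentioned above, here is a minimal sketch that uses IndexOf instead of Split. The module name SpanExtractor and the function ExtractSpanText are hypothetical names chosen for this example, and it assumes the opening tag is written exactly as <span id="..."> with no other attributes.

```vb
Imports System

Module SpanExtractor
    ' Hypothetical helper (not from the original answer): returns the text inside
    ' <span id="..."> ... </span>, or Nothing if the tag is missing or ambiguous.
    Public Function ExtractSpanText(html As String, spanId As String) As String
        Dim openTag As String = "<span id=""" & spanId & """>"
        Dim closeTag As String = "</span>"

        ' Locate the opening tag; give up if it is absent.
        Dim openPos As Integer = html.IndexOf(openTag, StringComparison.OrdinalIgnoreCase)
        If openPos < 0 Then Return Nothing

        ' Validation: refuse to guess if the same opening tag occurs a second time.
        Dim contentStart As Integer = openPos + openTag.Length
        If html.IndexOf(openTag, contentStart, StringComparison.OrdinalIgnoreCase) >= 0 Then Return Nothing

        ' Locate the first closing tag after the opening tag.
        Dim contentEnd As Integer = html.IndexOf(closeTag, contentStart, StringComparison.OrdinalIgnoreCase)
        If contentEnd < 0 Then Return Nothing

        Return html.Substring(contentStart, contentEnd - contentStart)
    End Function

    Sub Main()
        Dim html As String = "<p>Intro</p><span id=""something"">This is a text!!!</span>"
        Console.WriteLine(ExtractSpanText(html, "something"))
    End Sub
End Module
```

Returning Nothing for a missing or duplicated tag lets the caller decide how to report the problem, rather than silently using the wrong piece of text.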
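Answer 1: (score: 1)
Using an HTML parser is strongly recommended, because HTML contains many nested tags (see, for example, this question).
However, finding the contents of a single tag with Regex is possible, and it causes no major problems as long as the HTML is well-formed.
This is what you need (the function is case-insensitive):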
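    ' Requires: Imports System.Text.RegularExpressions at the top of the file
    Public Function FindTextInSpan(ByVal HTML As String, ByVal SpanId As String, ByVal LookFor As String) As String
        ' Lookbehind: an opening <span> with the given id; lookahead: a closing </span>
        Dim m As Match = Regex.Match(HTML, "(?<=<span.+id=""" & SpanId & """.*>.*)" & LookFor & "(?=.*<\/span>)", RegexOptions.IgnoreCase)
        ' Regex.Match never returns Nothing, so test Success instead
        Return If(m.Success, m.Value, "")
    End Function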
The function's parameters are:
HTML: the HTML code as a string.
SpanId: the id of the span (e.g. in <span id="hello">, hello is the id).
LookFor: the text to look for inside the span.
Test it online: http://ideone.com/luGw1V
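If the goal is to return everything inside the span rather than to search for a known piece of text, a capturing group may be a closer fit. The following is only a sketch of that variant, not the function above: GetSpanText and SpanRegexDemo are hypothetical names, and it assumes the span contains no nested span elements.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SpanRegexDemo
    ' Hypothetical variant: returns everything inside the span with the given id,
    ' or an empty string if no such span is found.
    Public Function GetSpanText(html As String, spanId As String) As String
        ' Regex.Escape keeps special characters in the id from being read as regex syntax.
        ' (.*?) is lazy, so the match stops at the first closing </span>.
        Dim pattern As String = "<span\b[^>]*\bid\s*=\s*""" & Regex.Escape(spanId) & """[^>]*>(.*?)</span>"
        ' Singleline lets "." match newlines, so the span's text may span several lines.
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        Return If(m.Success, m.Groups(1).Value, "")
    End Function

    Sub Main()
        Dim html As String = "<div><span class=""x"" id=""something"">This is a text!!!</span></div>"
        Console.WriteLine(GetSpanText(html, "something"))
    End Sub
End Module
```

Because the match ends at the first </span>, a span that contains another span would be cut short. That is one of the cases where an HTML parser is the safer choice.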