I want to download the inner HTML of an online page, but when I do, characters such as šđčćž are replaced by sequences like ć¡.
The code I'm using:
Dim sourceString As String = New System.Net.WebClient().DownloadString("SomeWebPage")
TextBox1.Text = sourceString
Answer 0 (score: 2)
You probably need to download the raw bytes and then convert them to a string as UTF-8 with the Encoding class:
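' Requires Imports System.Net and Imports System.Text
Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Download the response body as raw bytes, without any decoding
        Dim bytes = Await client.DownloadDataTaskAsync(address)
        ' Decode the bytes explicitly as UTF-8
        Dim s = Encoding.UTF8.GetString(bytes)
        Return s
    End Using
End Function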
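To see why decoding the raw bytes explicitly helps, here is a small language-neutral illustration in Python (it is not part of the original answer). It assumes the page is served as UTF-8 and that the garbling comes from decoding those bytes with a single-byte code page; Latin-1 is used purely as an example of such a code page.

```python
# Illustration only: the sample text and the "wrong" code page (Latin-1) are assumptions.
text = "šđčćž"

# What a UTF-8 server would send: two bytes for each of these characters.
raw = text.encode("utf-8")

# Decoding with a single-byte code page turns every byte into its own character,
# so "š" (bytes C5 A1) becomes the two characters "Å¡".
garbled = raw.decode("latin-1")

# Decoding the same bytes as UTF-8 recovers the original text.
restored = raw.decode("utf-8")

print(len(text), len(raw), len(garbled))  # 5 10 10
print(restored == text)                   # True
```

The bytes are never damaged in transit; only the choice of decoder differs, which is why picking the decoder yourself fixes the output.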
Thanks to @dave's comment, a simpler version:
Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Tell WebClient which encoding to use when it builds the string
        client.Encoding = Encoding.UTF8
        Dim s = Await client.DownloadStringTaskAsync(address)
        Return s
    End Using
End Function
Usage example:
Imports System.Net
Imports System.Text

Public Class Form1
    Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        Dim s = Await GetHtmlString("http://www.radiomerkury.pl/")
    End Sub

    Async Function GetHtmlString(address As String) As Task(Of String)
        Using client As New WebClient
            client.Encoding = Encoding.UTF8
            Dim s = Await client.DownloadStringTaskAsync(address)
            Return s
        End Using
    End Function
End Class
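Both versions of GetHtmlString assume the page really is UTF-8. When that is not known in advance, one option is to read the charset the server declares in its Content-Type header and decode with that. A minimal offline sketch in Python: `charset_from_content_type` is a hypothetical helper, and the header values and the ISO-8859-2 body are made-up examples, not taken from the page in the question.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return the charset named in a Content-Type header value, or `default`."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the lowercased charset parameter, or None if absent.
    return msg.get_content_charset() or default


# Assumed example: a body encoded as ISO-8859-2 with a matching header.
body = "šđčćž".encode("iso-8859-2")
header = "text/html; charset=ISO-8859-2"

decoded = body.decode(charset_from_content_type(header))
print(decoded == "šđčćž")
```

If the header carries no charset, the helper falls back to UTF-8, which matches the assumption the answer makes.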
Answer 1 (score: 0)
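Kibi, I think your approach is well off the mark. I don't see how VB.NET is going to help you with this kind of problem. Below is a simple, straightforward Excel & VBA solution. I hope it helps you achieve your goal.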
Sub DumpData()
    Dim IE As Object, itm As Object
    Dim URL As String
    Dim RowCount As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    URL = "http://finance.yahoo.com/q?s=sbux&ql=1"
    IE.Navigate2 URL

    ' Wait for the site to fully load (READYSTATE_COMPLETE = 4)
    Do While IE.Busy Or IE.ReadyState <> 4
        DoEvents
    Loop

    With Sheets("Sheet1")
        .Cells.ClearContents
        RowCount = 1
        ' One row per element: tag name, id, class name, first 1024 characters of inner text
        For Each itm In IE.document.all
            .Range("A" & RowCount) = itm.tagname
            .Range("B" & RowCount) = itm.ID
            .Range("C" & RowCount) = itm.classname
            .Range("D" & RowCount) = Left(itm.innertext, 1024)
            RowCount = RowCount + 1
        Next itm
    End With
End Sub