Read a local HTML file into a string using VBA

Asked: 2013-08-17 08:08:24

Tags: html vba excel-vba io excel

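This feels like it should be simple. I have an .HTML file stored on my computer, and I want to read the entire file into a string. When I try the super simple: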

Dim FileAsString as string 

Open "C:\Myfile.HTML" for input as #1
Input #1, FileAsString
Close #1

debug.print FileAsString

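I don't get the whole file. I only get the first few lines. (I know the Immediate window truncates output, but that is not the issue; the whole file is definitely not making it into my string.) I also tried an alternative approach using the FileSystemObject and got a similar result, only this time with lots of odd characters and question marks thrown in. That makes me think it may be some kind of encoding problem. (Frankly, though, I don't fully understand what that means. I know there are different encoding formats and that they can cause problems when parsing strings, but that's about it.)

More generally, here is what I would really like to know: how can I use VBA to open a file of any extension (viewable in a text editor) and any length (within VBA's string limit), and be sure that every character I would see in a basic text editor is read into the string? (If that cannot (easily) be done, I would certainly appreciate being pointed toward a method that is likely to work with .html files.) Thank you very much for your help.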

EDIT: Here is an example of what happens when I use the suggested method. Specifically:

Function FileToString(ByVal Path As String) As String
    Dim oFSO As Object
    Dim oFS As Object, sText As String

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile(Path)

    Do Until oFS.AtEndOfStream
        sText = oFS.ReadAll()
    Loop
    FileToString = sText

    Set oFSO = Nothing
    Set oFS = Nothing

End Function

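I'll show you the beginning (via a message box) and the end (via the Immediate window), because the two are odd in different ways. In both cases I compare it with a screenshot of the HTML source as displayed in Chrome: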

Start: (two screenshots: message box output and Chrome source)

End: (two screenshots: Immediate window output and Chrome source)

3 answers:

Answer 0: (score: 5)

Here is one way to do it:

Option Explicit

Sub test()

    Dim oFSO As Object
    Dim oFS As Object, sText As String

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv")

    Do Until oFS.AtEndOfStream
        ' sText = oFS.ReadLine 'read line by line
        sText = oFS.ReadAll()
        Debug.Print sText
    Loop

    oFS.Close
End Sub

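EDIT:

Try changing the OpenTextFile line to one of the 3 lines below and see whether it makes any difference. The format value is the fourth argument (the third is the create flag): 0 opens the file as ASCII, -1 as Unicode, and -2 uses the system default.

http://msdn.microsoft.com/en-us/library/aa265347(v=vs.60).aspx

Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv", 1, False, 0)
Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv", 1, False, -1)
Set oFS = oFSO.OpenTextFile("C:\Users\osknows\Desktop\import-store.csv", 1, False, -2)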
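
To make the encoding choice easier to experiment with, the read could be wrapped in a small helper. This is only a sketch: the name ReadTextFile and its fmt parameter are made up for illustration. It relies on the documented OpenTextFile signature (filename, iomode, create, format), where format is 0 for ASCII, -1 for Unicode (UTF-16) and -2 for the system default.

```vba
' Hypothetical helper (not from the original answer): reads a whole text
' file through FileSystemObject using an explicit format value.
'   fmt =  0  -> ASCII            (TristateFalse)
'   fmt = -1  -> Unicode / UTF-16 (TristateTrue)
'   fmt = -2  -> system default   (TristateUseDefault)
Function ReadTextFile(ByVal filePath As String, _
                      Optional ByVal fmt As Long = -2) As String
    Const ForReading As Long = 1
    Dim oFSO As Object
    Dim oFS As Object

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    ' Third argument (create) is False: do not create the file if missing.
    Set oFS = oFSO.OpenTextFile(filePath, ForReading, False, fmt)

    ' ReadAll raises an error on an empty file, so check first.
    If Not oFS.AtEndOfStream Then ReadTextFile = oFS.ReadAll

    oFS.Close
End Function
```

For example, `Debug.Print Len(ReadTextFile("C:\Myfile.HTML", -1))` would try the file as UTF-16. FileSystemObject has no UTF-8 mode, so a UTF-8 file may still show stray characters with any of the three values.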

EDIT 2:

Does this code work for you?

Function ExecuteWebRequest(ByVal url As String) As String

    Dim oXHTTP As Object

    Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
    oXHTTP.Open "GET", url, False
    oXHTTP.send
    ExecuteWebRequest = oXHTTP.responseText
    Set oXHTTP = Nothing

End Function

Sub OutputText(ByVal outputstring As String)
    Dim MyFile As String
    Dim fnum As Integer

    MyFile = ThisWorkbook.Path & "\temp.html"
    'set and open file for output
    fnum = FreeFile()
    Open MyFile For Output As #fnum
    'use Print when you want the string without quotation marks
    Print #fnum, outputstring
    Close #fnum
End Sub

Sub test()
    Dim oFSO As Object
    Dim oFS As Object, sText As String
    Dim Uri As String, HTML As String

    Uri = "http://www.forrent.com/results.php?search_type=citystate&page_type_id=city&seed=859049165&main_field=12345&ssradius=-1&min_price=%240&max_price=No+Limit&sbeds=99&sbaths=99&search-submit=Submit"
    HTML = ExecuteWebRequest(Uri)
    OutputText HTML

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile(ThisWorkbook.Path & "\temp.html")

    Do Until oFS.AtEndOfStream
        ' sText = oFS.ReadLine 'read line by line
        sText = oFS.ReadAll()
        Debug.Print sText
    Loop

    oFS.Close
End Sub


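Answer 1: (score: 1)

OK, so I finally figured this out. The VBA FileSystemObject reads files as ASCII by default, and I had saved mine as Unicode. Sometimes, as in my case, saving the file as ASCII can cause errors. You can get around this, however, by reading the file as binary and then converting it back to a string. The details are explained at http://bytes.com/topic/asp-classic/answers/521362-write-xmlhttp-result-text-file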
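
One common way to do this kind of byte-to-string conversion in VBA is ADODB.Stream, which decodes a file with a named character set. The sketch below is not taken from the linked page. The function name ReadFileWithCharset is hypothetical, and the "utf-8" default is an assumption about how the file was saved.

```vba
' Hypothetical helper: loads a file through ADODB.Stream and decodes it
' with the given charset. "utf-8" is an assumed default; pass "unicode"
' for UTF-16 or another charset name if the file was saved differently.
Function ReadFileWithCharset(ByVal filePath As String, _
                             Optional ByVal charsetName As String = "utf-8") As String
    Const adTypeText As Long = 2
    Dim stm As Object

    Set stm = CreateObject("ADODB.Stream")
    With stm
        .Type = adTypeText          ' treat the stream contents as text
        .Charset = charsetName      ' encoding used to decode the bytes
        .Open
        .LoadFromFile filePath      ' pull the raw bytes from disk
        ReadFileWithCharset = .ReadText   ' reads to the end by default
        .Close
    End With
End Function
```

Calling `sText = ReadFileWithCharset("C:\Myfile.HTML")` would then return the decoded text, provided the charset guess matches the file.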

Answer 2: (score: 0)

A bit late to answer, but I did exactly this today (and it works well):

Sub modify_local_html_file()
    Dim url As String
    Dim html As Object
    Dim fill_a As Object

    url = "C:\Myfile.HTML"

    Dim oFSO As Object
    Dim oFS As Object, sText As String

    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFS = oFSO.OpenTextFile(url)

    Do Until oFS.AtEndOfStream
        sText = oFS.ReadAll()
        Debug.Print sText
    Loop

    Set html = CreateObject("htmlfile")
    html.body.innerHTML = sText

    oFS.Close
    Set oFS = Nothing

    '# grab some element #'
    Set fill_a = html.getElementById("val_a")

    MsgBox fill_a.innerText

    '# change its inner text #'
    fill_a.innerText = "20%"

    MsgBox fill_a.innerText

    '# open file this time to write to #'
    Set oFS = oFSO.OpenTextFile(url, 2)

    '# write it modified html #'
    oFS.write html.body.innerHTML
    oFS.Close

    Set oFSO = Nothing
    Set oFS = Nothing

End Sub