搜索列的网址,将网页保存为单个文本文件

时间:2013-02-26 03:30:11

标签: excel vba excel-vba

我这里的代码适用于硬编码的网址,它只适用于一个网址和一个文本文件。

Sub saveUrl_Test()

Dim FileName As String
Dim FSO As Object
Dim ieApp As Object  
Dim Txt As String
Dim TxtFile As Object
Dim URL As String

    URL = "www.bing.com"
    FileName = "C:\mallet\bing.com.txt"

    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set TxtFile = FSO.OpenTextFile(FileName, 2, True, -1)

    Set ieApp = CreateObject("InternetExplorer.Application")
    ieApp.Visible = True
    ieApp.Navigate URL

    While ieApp.Busy Or ieApp.ReadyState <> 4
        DoEvents
    Wend

    Txt = ieApp.Document.body.innerText
    TxtFile.Write Txt
    TxtFile.Close

    ieApp.Quit

    Set ieApp = Nothing
    Set FSO = Nothing
End Sub

我想要它做的是在列B中搜索URL(可能使用InStr(变量,“http://”)作为布尔值),然后将每个网页保存为单独的文本文件。是否有办法使用部分URL字符串命名文本文件?此外,有没有办法让网页不打开,但仍然保存为文本文件?打开网页浪费了很多时间。

我根据@MikeD的建议创建了这个附加子,但是我在没有错误的情况下得到了它。

Sub url_Test(URL As String, FileName As String)

Dim FSO As Object
Dim ieApp As Object
Dim Txt As String
Dim TxtFile As Object

    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set TxtFile = FSO.OpenTextFile(FileName, 2, True, -1)

    Set ieApp = CreateObject("InternetExplorer.Application")
    ieApp.Visible = True
    ieApp.Navigate URL

    While ieApp.Busy Or ieApp.ReadyState <> 4
        DoEvents
    Wend

    Txt = ieApp.Document.body.innerText
    TxtFile.Write Txt
    TxtFile.Close

    ieApp.Quit

    Set ieApp = Nothing
    Set FSO = Nothing
End Sub

Sub LoopOverB()  
Dim myRow As Long

    myRow = 10

    While Cells(myRow, 2).Value <> ""

        If InStr(1, Cells(myRow, 2).Value, "http:\\", vbTextCompare) Then Call url_Test(Cells(myRow, 2).Value, "C:\mallet\test\" & Cells(myRow, 1).Value & ".txt")
        myRow = myRow + 1
    Wend 
End Sub

1 个答案:

答案 0 :(得分:0)

首先,您可以参数化子

Sub saveUrl_param(URL as String, FileName as String)
    ....
End Sub

并删除DimURL

FileName和作业语句

其次,你编写另一个Sub,循环遍历B列中的非空单元格,检索值并有条件地调用saveUrl_param()例程。

示例:

Sub LoopOverB()
Dim C As Range
    For Each C In Intersect(ActiveSheet.Columns("B"), ActiveSheet.UsedRange).SpecialCells(xlCellTypeConstants)
        ' If C = .... Then ' note: URL in [B], filename in [C]
        '     saveUrl_param(C, C(1,2))
        ' End If
    Next C
End Sub

和否 - 如果不打开网页就无法做到;你不知何故必须从服务器(或代理)获取页面。这是由

完成的
ieApp.Navigate URL

并且以下While ... Wend构造等待页面完全加载到浏览器对象中。

加快你可以跳过的事情

ieApp.Visible = True

一旦您确信您的Sub工作正常,您就可以移动

Dim ieApp As Object ' I would prefer As SHDocVw.InternetExplorer .... don't like late binding
Set ieApp = CreateObject("InternetExplorer.Application")

到调用sub并将ieApp对象移交给子程序作为参数,以便不再打开/关闭浏览器&amp;试。