可变日期vba的网页抓取

时间:2018-07-12 09:50:11

标签: excel vba web-scraping

寻找一种从内部网络抓取/导入网页并将其粘贴到excel中的解决方案。

客观 周二至周五-访问网页并导入当天和前一天的数据。在星期一,它需要导入当天和前3天(周日,周六和周五)的数据。

我昨天录制了宏,该宏复制其中包含网址的excel单元格,粘贴到“新建Web查询”中的地址字段中,并进行导入过程,并在前一天重复。

这提供了预期的结果,但是当我今天早晨再次运行宏时,由于网址是硬编码的,因此它返回了昨天和前一天的数据。

我将网址的开始与日期元素连接起来,今天要访问的网页的地址位于单元格K2,前一天K3,-2天K4和-3天K5中。

网页地址的常量部分以http:/ ..... prd03开头!后跟变量yyyy!mm!dd

例如今天的http:/ ..... prd03!2018!07!12 例如昨天的http:/ ..... prd03!2018!07!11

明天http:/ ..... prd03!2018!07!12是昨天

下面是宏记录生成的代码 结尾为

Application.CutCopyMode = False
Range("K2").Select
ActiveCell.FormulaR1C1 = _
    "http:....prd03!2018!07!11" 'can't show full address
Range("G9").Select
ActiveWorkbook.Worksheets.Add
ActiveSheet.Name = "Today"
With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http:....prd03!2018!07!11", _ ' the URL is hard coded
    Destination:=Range("$A$1"))
    .Name = "...prd03!2018!07!11" 'can't show full name
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
End With

我已经打开了网页html源代码,今天的日期如下所示

TITLE =“主页名称” HREF =“ / ......!prd03!2018!07!12”> 12

您能提供的任何帮助将不胜感激。如果需要更多信息,请告诉我。

1 个答案:

答案 0 :(得分:2)

在VBA中,您可以对URL进行编码以包括日期:

Dim fmtToday As String
Dim fmtYesterday As String
Dim fmtTwoDays As String
Dim fmtThreeDays As String
Dim BaseURL As String

BaseURL = "....prd03!" ' the first part of your url, change this to reflect your actual URL excluding http://

fmtToday = BaseURL & Format(Now, "yyyy!mm!dd") ' combine the BaseURL with the formated date
fmtYesterday = BaseURL & Format(Now - 1, "yyyy!mm!dd")  'combine the BaseURL with the formated date minus 1 day
fmtTwoDays = BaseURL & Format(Now - 2, "yyyy!mm!dd")  ' combine the BaseURL with the formated date minus 2 days
fmtThreeDays = BaseURL & Format(Now - 3, "yyyy!mm!dd")  ' combine the BaseURL with the formated date minus 3 days

然后您可以在代码中引用它们:

Application.CutCopyMode = False

Range("K2").Value = "http://" & fmtToday

ActiveWorkbook.Worksheets.Add
ActiveSheet.Name = "Today"

With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://" & fmtToday, _
    Destination:=Range("$A$1"))
    .Name = fmtToday
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
End With

我已经调整了您的代码以与fmtToday一起使用,要与以前的日子一起使用,您将需要相应地调整代码。

相关问题