我试图在Excel vba中创建一个应用程序,但我有些问题。我需要我的Excel应用程序来下载一些zip格式的文件。我已经完成了这部分问题,我的应用程序可以下载并解压缩文件。接下来我要读取.htm扩展名中的整个文件并从中获取一些信息。需要像这样工作,当应用程序打开时,程序应该查找在Basil中的“concurso”的最后一个竞赛号码,然后在.htm文件中查找相同的号码并开始复制下一个数据。
我已经发现了一种读取文件并获取我想要的数据的模式,但我不知道如何编写它。要提取的.htm文件中的模式是内部标签td,一个有2个斜线的文本,因此,我有一个日期,此时,我要做3件事,得到日期,日期上面的行我的concurso数量,所以我也需要得到它,并且15天以下的15行我也需要它们。此模式不会更改,必须处理直到.htm文件结束。并将这些数据传输到我的工作表以便稍后进行操作。
如果对这个问题有些怀疑,我会提供进一步的澄清 这是我用来下载和解压缩文件的代码。↓
Sub DownloadEUnzip()
Dim FSO, oApp As Object
Dim objHttp, DefPath, Arquivo As String
Dim Dados() As Byte
Dim Fname As Variant
Dim FileNameFolder As Variant
Dim iFileNumber As Long
Dim diretorio As String
diretorio = Dir("c:\lotofacil\D_LOTFAC.HTM")
If diretorio = "D_LOTFAC.HTM" Then
Kill "C:\lotofacil\*"
End If
Set objHttp = CreateObject("MSXML2.ServerXMLHTTP")
objHttp.Open "GET", "http://www1.caixa.gov.br/loterias/_arquivos/loterias/D_lotfac.zip", False
objHttp.Send
DefPath = "C:\lotofacil\" '<<< Altere aqui
Arquivo = DefPath & "D_lotfac.zip"
If objHttp.Status = "200" Then
Dados = objHttp.ResponseBody
iFileNumber = FreeFile
Open Arquivo For Binary Access Write As #iFileNumber
Put #iFileNumber, 1, Dados
Close #iFileNumber
End If
If Right(DefPath, 1) <> "\" Then
DefPath = DefPath & "\"
End If
FileNameFolder = DefPath
Set oApp = CreateObject("Shell.Application")
oApp.Namespace(FileNameFolder).CopyHere oApp.Namespace("C:\lotofacil\D_lotfac.zip").items
MsgBox "Arquivos baixados e descompactados com sucesso!"
End Sub
HERE可以下载文件以查看问题。
ps 在C:驱动器上创建一个名为lotofacil的文件夹,以便电子表格正常工作。
更新1
查找日期的代码
If Mid(dataline, 19, 1) = "/" And Mid(dataline, 22, 1) = "/" Then
Debug.Print dataline
End If
更新2
所以caio,现在真的很快,但是当我使用时我注意到该程序占用了一个不到需要的列,并且我更改了代码并且它的工作原理显然......就像你看一看如果我没有任何混乱...... 我改变了数组的大小,看起来很有用:)看看。
Sub ReadLines()
Dim dataArray() As String
Dim strText
Dim result As String
Dim regExDate As New RegExp, regExAnyContent As New RegExp
Dim matches As MatchCollection
Dim match As match
Dim previous As String, current As String
Dim currentLine As Integer
ReDim dataArray(17, 1000)
regExDate.Pattern = "(\d{2}/\d{2}/\d{4})"
regExAnyContent.Pattern = "<td[^>]*>([^<]*)"
dirPath = "c:\lotofacil\"
filePath = dirPath & "D_LOTFAC.HTM"
result = ""
currentLine = 0
If Not Dir(filePath) = "D_LOTFAC.HTM" Then Exit Sub
FileNum = FreeFile()
Open filePath For Input As #FileNum
previous = ""
While Not EOF(FileNum)
Line Input #FileNum, current ' read in data 1 line at a time
If Len(current) > 0 Then
Set matches = regExDate.Execute(current)
If matches.Count > 0 Then
dataArray(1, currentLine) = matches.Item(0)
dataArray(0, currentLine) = regExAnyContent.Execute(previous).Item(0).SubMatches(0)
For i = 1 To 16
Line Input #FileNum, current
While current = ""
Line Input #FileNum, current
Wend
dataArray(1 + i, currentLine) = regExAnyContent.Execute(current).Item(0).SubMatches(0)
Next
currentLine = currentLine + 1
If currentLine Mod 1000 = 0 Then
ReDim Preserve dataArray(17, currentLine + 1000)
End If
End If
previous = current
End If
' decide what to do with dataline,
' depending on what processing you need to do for each case
Wend
Range(Cells(1, 1), Cells(currentLine, 17)) = Application.Transpose(dataArray)
End Sub
但仍然发生了一件非常奇怪的事情,在数据表中,日期是错误的,我需要它们的格式为dd / mm / yyyy,我知道我已经在烦你了,但如果是太难做出这个改变你可以取消这个日期的专栏吗?请...
首先非常感谢你,你真的很擅长excel;)
答案 0 :(得分:1)
尝试将此文件读取到剪贴板并将其内容粘贴到工作表中,这将创建一个您可以使用的普通Excel表。
这将使用excel的自然能力将html表解析为常规excel表。
Sub ReadFilePasteAsTable() Dim objData As New MSForms.DataObject Dim strText Dim result As String Dim numberOfLines Integer Dim wsh As Object Set wsh = VBA.CreateObject("WScript.Shell") numberOfLines = 126 dirPath = "c:\lotofacil\" diretorio = Dir(dirPath & "D_LOTFAC.HTM") result = "" If Not diretorio = "D_LOTFAC.HTM" Then Exit Sub FileNum = FreeFile() filePath = dirPath & "D_LOTFAC.HTM" outPath = dirPath & "out.txt" pscommand = "Powershell -Command ""''+$(cat " & filePath & " -Tail 126) > " & outPath & """" wsh.Run pscommand, 0, True Open outPath For Input As #FileNum While Not EOF(FileNum) Line Input #FileNum, DataLine ' read in data 1 line at a time result = result & DataLine ' decide what to do with dataline, ' depending on what processing you need to do for each case Wend objData.SetText result objData.PutInClipboard ActiveSheet.Paste Destination:=[A1] End Sub
不要忘记添加对Microsoft Forms 2.0的引用。要添加引用打开VBA窗口,请打开菜单Tools-&gt; References ...
如果找不到 Microsoft Forms 2.0对象库,请打开浏览...对于64位操作系统或C:\ Windows \ System32,它将位于C:\ Windows \ SysWOW64 \ FM20.dll中\ FM20.dll为32位。
现在您需要添加对Microsoft VBScript Regular Expressions 5.5
Sub ReadLines()
Dim dataArray() As String
Dim strText
Dim result As String
Dim regExDate As New RegExp, regExAnyContent As New RegExp
Dim matches As MatchCollection
Dim match As match
Dim previous As String, current As String
Dim currentLine As Integer
ReDim dataArray(16, 1000)
regExDate.Pattern = "(\d{2}/\d{2}/\d{4})"
regExAnyContent.Pattern = "<td[^>]*>([^<]*)"
dirPath = "c:\lotofacil\"
filePath = dirPath & "D_LOTFAC.HTM"
result = ""
currentLine = 0
If Not Dir(filePath) = "D_LOTFAC.HTM" Then Exit Sub
FileNum = FreeFile()
Open filePath For Input As #FileNum
previous = ""
While Not EOF(FileNum)
Line Input #FileNum, current ' read in data 1 line at a time
If Len(current) > 0 Then
Set matches = regExDate.Execute(current)
If matches.Count > 0 Then
dataArray(1, currentLine) = matches.Item(0)
dataArray(0, currentLine) = regExAnyContent.Execute(previous).Item(0).SubMatches(0)
For i = 1 To 15
Line Input #FileNum, current
While current = ""
Line Input #FileNum, current
Wend
dataArray(1 + i, currentLine) = regExAnyContent.Execute(current).Item(0).SubMatches(0)
Next
currentLine = currentLine + 1
If currentLine Mod 1000 = 0 Then
ReDim Preserve dataArray(16, currentLine + 1000)
End If
End If
previous = current
End If
' decide what to do with dataline,
' depending on what processing you need to do for each case
Wend
Range(Cells(1, 1), Cells(currentLine, 16)) = Application.Transpose(dataArray)
End Sub