从众多Excel文件中提取到一个数据表或文件中

时间:2014-08-21 12:19:51

标签: sql excel vba access-vba

我有超过100个.xlsx文件。每个文件有两张。第一张表(通常称为sts)通常有15-20,000行,其中一列名为" Code"。第二张表(总是称为cps)有大约85k行,也有相同的代码列。

我需要从表单中提取特定代码的所有行,将表格/表单以及特定代码的所有行从表格cps提取到第二个表格/表格中。我需要为所有文件执行此操作。

我已尝试过两种方法

1)使用Excel VBA打开每个文件,使用自动过滤器将所需的代码行复制到主工作簿中进行整理。使用以下代码从预定义的起始目录中获取文件,并向下钻取Public Sub SearchFiles()

Public Sub SearchFiles()

'Macro to start the file extraction by drilling down from the mydir path specified
Dim code As String
Dim time1 As Double
Dim time2 As Double

Range("a1").Value = InputBox("Please type code to extract", code)
time1 = Timer

myFileSearch _
myDir:="C:\Data\Dashboard\2014\New Files Excel Loop", _
FileNameLike:="Reporting", _
FileTypeLike:=".xlsx", _
SearchSubFol:=True, _
myCounter:=0

time2 = Timer
MsgBox time2 - time1 & "seconds"

End Sub


Private Sub myFileSearch(myDir As String, FileNameLike As String, FileTypeLike As String, _
  SearchSubFol As Boolean, myCounter As Long)
Dim fso As Object, myFolder As Object, myFile As Object
Dim Rowcount As Long
Dim rowcount2 As Long
Dim masterbook As Workbook
Set masterbook = ThisWorkbook
Set fso = CreateObject("Scripting.FileSystemObject")
Dim commodity As String

code = Range("a1").Value

Application.ScreenUpdating = False

For Each myFile In fso.GetFolder(myDir).Files
    Workbooks.Open (myDir & "\" & myFile.Name)
    myCounter = myCounter + 1
    ReDim Preserve myList(1 To myCounter)
    myList(myCounter) = myDir & "\" & myFile.Name

    ''loop to pull out all code rows in your directories into new file
    Workbooks(Workbooks.Count).Worksheets(1).Range("d2").Activate
    Rowcount = Workbooks(1).Sheets(1).Range("a1").CurrentRegion.Rows.Count + 1
    Rows(1).AutoFilter
    Range("A1").AutoFilter Field:=3, Criteria1:=code, Operator:=xlAnd
    Range("A1").CurrentRegion.SpecialCells(xlCellTypeVisible).Copy _
    Destination:=Workbooks(1).Sheets(1).Range("a" & Rowcount)

    'filter out the code data
    Workbooks(Workbooks.Count).Worksheets(2).Activate
    Range("d2").Activate
    rowcount2 = Workbooks(1).Sheets(2).Range("a1").CurrentRegion.Rows.Count + 1
    Rows(1).AutoFilter
    Range("A1").AutoFilter Field:=6, Criteria1:=code, Operator:=xlAnd
    Range("A1").CurrentRegion.SpecialCells(xlCellTypeVisible).Copy _
      Destination:=Workbooks(1).Sheets(2).Range("a" & Rowcount)

    Workbooks(myFile.Name).Close savechanges:=False
Next

If SearchSubFol Then
    For Each myFolder In fso.GetFolder(myDir).SubFolders
        myFileSearch myDir & "\" & myFolder.Name, FileNameLike, FileTypeLike, True, myCounter
    Next
End If

End Sub

打开每个工作簿需要5-10秒,整个过程非常缓慢(目前还有错误)。

2)将所有内容导入两个Access表,然后清除我想要的代码行。由于行数的原因,这比Excel方法慢。

Sub pulloop()

DoCmd.RunSQL "delete * from sts"
DoCmd.RunSQL "delete * from cps"

strSql = "PathMap"
Set rs = CurrentDb.OpenRecordset(strSql)

With rs

    If Not .BOF And Not .EOF Then
        .MoveLast
        .MoveFirst

        While (Not .EOF)
            importfile = rs.Fields("Path")

            DoCmd.TransferSpreadsheet acimport, acSpreadsheetTypeExcel12, "Sts", importfile, True, "Sts!A:G"

           DoCmd.TransferSpreadsheet acimport, acSpreadsheetTypeExcel12, "CPs", importfile, True, "CPs!A:Q"
            'Debug.Print rs.Fields("Path")
            .MoveNext
        Wend

    End If

    .Close

End With

End Sub

我改编了这个尝试并使用AcLink,但我正在努力实现它。是否有可能使用aclink而不是acimport来查询每个文件在访问Access时所需的代码行,如果是这样,这可能是一种更快的方式?

2 个答案:

答案 0 :(得分:0)

看起来您的第二个选项中的一个问题,我倾向于支持,是您从Excel文件导入所有行。尝试使用Excel对象模型在两个工作表上定义命名范围,然后在循环中使用docmd.transferspreadsheet。您需要更改另一个工作表的列引用。 HTH。

查找使用的实际行的代码,定义命名范围并导入Access:

Dim xlApp As Excel.Application     
Dim xlWkb As Excel.Workbook       
Dim xlWS As Excel.Worksheet 
Dim lngLastRow as Long
Dim myImportRange as Range
dim strRangeName as String
set xlApp = New Excel.Application
xlApp.Visible=False 'make it go faster
set xlWB = xlApp.Workbooks.Open("PATH")
set xlWS = xlWB.Sheets("sts")
lngLastRow=xlWS.Range("A" & xlWS.Rows.Count).End(xlUp).Row
Set myImportRange = xlWS.Range("A1:G" & lnglastrow)
strRangeName="myData_2014MMDD"  'or any name that makes sense to you
myImportRange.Name=strRangeName
xlWB.Save
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel12, <Dest Table>, xlWb.FullName, True, strRangeName
xlApp.DisplayAlerts=False 'suppress save changes prompts
xlWB.Close False 

答案 1 :(得分:0)

考虑第三种直接在附加SQL查询中查询工作簿的方法:

With rs
   .MoveLast
   .MoveFirst

   While (Not .EOF)                
      importfile = rs.Fields("Path")
      Debug.Print importfile

      sql = "INSERT INTO sts " _
          & " SELECT * FROM [Excel 12.0 Xml;HDR = Yes;Database=" & importfile & "].[Sts$A:G]"

       CurrentDb.Execute sql, dbFailOnError

      .MoveNext
   Wend        

   .Close    
End With