如何将Access 2013中的大型ADO记录集传输到CSV文件?

时间:2014-11-13 22:38:03

标签: sql-server vba csv ado

在将大型ADO记录集从Access 2013传输到新的CSV文件时,我遇到了一些性能问题。

我的要求是:
1)文件必须包含列名称
2)Recordset可包含超过500,000条记录
3)文件必须以逗号分隔 4)记录包含有时包含逗号的字母数字字符串
5)程序必须创建一个新的CSV文件(不只是更新预先存在的文件)
注意:数据存储在记录集中,因为它是从MS-SQL Server查询的

我对VBA编程比较新,所以任何帮助都会受到高度赞赏。我读到从文件的开头到结尾运行的GetRows循环可以是将大型记录集导出为CSV的有效过程。致谢

这是我目前的实施,显然违反了我的一些要求:

    'WRITE DATA TO TEXT FILE
     Dim f As ADODB.Field
     Dim myFileSystemObject As Object
     Dim txtfile As Object
     Set myFileSystemObject = CreateObject("Scripting.FileSystemObject")
     Set txtfile = myFileSystemObject.CreateTextFile(strPath & ".txt", True)
     With adRs
      For Each f In .Fields
         txtfile.Write (f.Name)
         txtfile.Write Chr(9)
      Next
      txtfile.WriteLine
      txtfile.Write adRs.GetString(adClipString)
      .Close
     End With
     txtfile.Close

2 个答案:

答案 0 :(得分:0)

你可以试试这个(虽然我不知道表现会受到影响;你的问题似乎有两个 - 我如何得到我的结果,然后我怎样才能有效地做到这一点。)

'WRITE DATA TO TEXT FILE
Dim f As ADODB.Field
Dim myFileSystemObject As Object
Dim txtfile As Object
Dim str As String

Set myFileSystemObject = CreateObject("Scripting.FileSystemObject")
Set txtfile = myFileSystemObject.CreateTextFile(strPath & ".txt", True)

With adRs
    For Each f In .Fields
        txtfile.Write (f.Name)
        txtfile.Write Chr(9)
    Next

    txtfile.WriteLine

    'Add in leading double quote, double quote text qualifier throughout, and changing carriage returns to double quote/carriage return/double quote.
    str = Chr(34) & Replace(Replace(adRs.GetString, vbTab, Chr(34) & "," & Chr(34)), vbCr, Chr(34) & vbCr & Chr(34))

    'Write to file removing last extraneous double quote.
    txtfile.Write Mid(str, 1, Len(str) - 1)

    .Close
    End With

txtfile.Close

所有代码正在做的是添加"到字符串的开头,将所有以前的制表符更改为","并将所有回车更改为" [cr]"。写入功能中的替换应该删除最后一个无关的"。文本分隔符应该保护任何嵌入的逗号。再一次,不确定性能,但似乎你可以得到一些东西并运行到你的规格,至少。

编辑:如果这不能让你到达目的地,你也可以查看SQL Server的BCP实用程序。

答案 1 :(得分:0)

这里比赛的后期 - 这个问题超过3岁! - 并且你得到了一个答案,它实现了在引号中封装文本字段,这是一个称职答案的标记。

但是,很少有解决这三大问题的答案:

  1. 优化VBA笨重的字符串处理;
  2. 字节顺序标记,VBA嵌入其保存的每个字符串中 文件,确保csv文件的一些最常见的使用者 无法正确阅读;
  3. ......写文件之间从来没有任何中间立场 逐行,永远,并把它写在一个大块,扔掉 内存不足错误。
  4. VBA中的初学者可能会发现字符串优化很难理解:本机VBA中可用的最大性能增益是避免字符串分配和连接(这就是为什么:http://www.aivosto.com/vbtips/stringopt2.html#huge) - 所以我使用join,split和替换而不是myString = MyString & MoreString

    尾随循环,最后调用RecordSet.GetRows(),会引起对结构化编程强烈意见的编码人员的注意。

    所以这里是:

    
     Public Function RecordsetToCSV(ByRef rst As ADODB.Recordset, _
                                    ByRef OutputFile As String, _
                                    Optional ByRef FieldList As Variant, _
                                    Optional ByVal CoerceText As Boolean = True, _
                                    Optional ByVal CleanupText As Boolean = True _
                                    ) As Long

    ' Output a recordset to a csv file and returns the row count.

    ' If the output file is locked, or specified in an inaccessible location, the ' 'ByRef' OutputFile parameter becomes a file in the user's local temp folder

    ' You can supply your own field list. This isn't a substituted file header of ' aliased field names: it is a subset of the field names, which ADO will read ' selectively from the recordset. Each item in the list matches a named field

    ' CoerceText=TRUE will encapsulate all items, numeric or not, in quote marks. ' CleanupText=TRUE strips quotes and linefeeds from the data: FALSE is faster

    ' You should only set them FALSE if you're confident that the data is 'clean' ' with no quote marks, commas or line breaks in any unencapsulated text field

    ' This code handles unicode, and outputs a file that can be read by Microsoft ' ODBC and OLEDB database drivers by removing the Byte Order Marker.

    On Error Resume Next

    ' Coding note: we're not doing any string-handling in VBA.Strings: allocating ' deallocating and (especially!) concatenating are SLOW. We are using the VBA ' Join and Split functions ONLY. Feel free to optimise further by declaring a ' faster set of string functions from the Kernel if you want to. ' ' Other optimisations: type pun. Byte Arrays are interchangeable with strings ' Some of our loops through these arrays have a 'step' of 2. This optimises a ' search-and-replace for ANSI chars in an array of 2-byte unicodes. Note that ' it's only used to remove known ANSI 'Latin' characters with a 'low' byte of ' zero: any other use of the two-byte 'step' will fail on non-Latin unicodes.

    ' ** THIS CODE IS IN THE PUBLIC DOMAIN ** ' Nigel Heffernan Excellerando.Blogspot.com

    Const FETCH_ROWS As Long = 4096

    Dim COMMA As String * 1 Dim BLANK As String * 4 Dim EOROW As String * 2

    COMMA = ChrW$(44) BLANK = ChrW$(13) & ChrW$(10) & ChrW$(13) & ChrW$(10) EOROW = ChrW$(13) & ChrW$(10)

    Dim FetchArray As Variant

    Dim i As Long ' i for rows in the output file, records in the recordset Dim j As Long ' j for columns in the output file, fields in the recordset Dim k As Long ' k for all other loops: bytes in individual data items

    Dim i_Offset As Long

    Dim i_LBound As Long Dim i_UBound As Long Dim j_LBound As Long Dim j_UBound As Long Dim k_lBound As Long Dim k_uBound As Long

    Dim hndFile As Long Dim varField As Variant

    Dim iRowCount As Long Dim arrBytes() As Byte Dim arrTemp1() As String Dim arrTemp2() As String Dim arrTemp3(0 To 2) As String

    Dim boolNumeric As Boolean

    Dim strHeader As String Dim arrHeader() As Byte

    Dim strFile As String Dim strPath As String Dim strExtn As String

    strFile = FileName(OutputFile) strPath = FilePath(OutputFile) strExtn = FileExtension(strFile)

    If rst Is Nothing Then Exit Function If rst.State <> 1 Then Exit Function

    If strExtn = "" Then strExtn = ".csv" End If

    With FSO

    If strFile = "" Then
        strFile = .GetTempName
        strFile = Left(strFile, Len(strFile) - Len(".tmp"))
        strFile = strFile & strExtn
    End If
    
    If strPath = "" Then
        strPath = TempSQLFolder
    End If
    
    If Right(strPath, 1) <> "\" Then
        strPath = strPath & "\"
    End If
    
    strExtn = FileExtension(strFile)
    If strExtn = "" Then
        strExtn = ".csv"
        strFile = strFile & strExtn
    End If
    
    
    OutputFile = strPath & strFile
    

    End With

    If FileName(OutputFile) <> "" Then If Len(VBA.FileSystem.Dir(OutputFile, vbNormal)) <> 0 Then

        Err.Clear
        VBA.FileSystem.Kill OutputFile  ' do it now, and reduce wait for deletion
        If Err.Number = 70 Then  ' permission denied: change the output file name
            OutputFile = FileStripExtension(OutputFile) & "_" & FileStripExtension(FSO.GetTempName) & FileExtension(OutputFile)
        End If
    
    End If
    

    End If

    ' ChrW$() gives a 2-byte 'Wide' char. This coerces all subsequent operations to UTF16

    arrTemp3(0) = ChrW$(34) ' Encapsulating quote arrTemp3(1) = vbNullString ' The field value will go here arrTemp3(2) = ChrW$(34) ' Encapsulating quote

    If rst.EOF And rst.BOF Then FetchArray = Empty ElseIf rst.EOF Then rst.MoveFirst End If

    ' An empty recordset must still write a header row of field names: we put this in the ' output buffer and write it to the file before we start looping through the records.

    ReDim FetchArray(0 To rst.Fields.Count, 0 To 0)

    i_LBound = 0 i_UBound = 0

    If IsMissing(FieldList) Then

    For j = LBound(FetchArray, 1) To UBound(FetchArray, 1) - 1 Step 1
        FetchArray(j, i_UBound) = rst.Fields(j).Name
    Next j
    

    Else

    j = 0
    
    For Each varField In FieldList
        j_UBound = j_UBound + 1
    Next varField
    
    ReDim arrTemp2(j_LBound To j_UBound)
    For Each varField In FieldList
        FetchArray(j, i_UBound) = CStr(varField)
        j = j + 1
    Next varField
    

    End If

    ReDim arrTemp1(i_LBound To i_UBound) ' arrTemp1 is the rowset we write to file ReDim arrTemp2(j_LBound To j_UBound) ' arrTemp2 represents a single record

    Do Until IsEmpty(FetchArray)

    i_LBound = LBound(FetchArray, 2)
    i_UBound = UBound(FetchArray, 2)
    
    j_LBound = LBound(FetchArray, 1)
    j_UBound = UBound(FetchArray, 1)
    
    If UBound(arrTemp1) <> i_UBound + 1 Then
        ReDim arrTemp1(i_LBound To i_UBound + 1)
        arrTemp1(i_UBound + 1) = vbNullString   ' The 'Join' operation will insert a trailing row
    End If                                      ' delimiter here (Not required by the last chunk)
    
    If UBound(arrTemp2) <> j_UBound Then
        ReDim arrTemp2(j_LBound To j_UBound)
    End If
    
    
    ' Data body. This is heavily optimised to avoid VBA String functions with allocations
    
    For i = i_LBound To i_UBound Step 1
    
        ' If this is confusing... Were you expecting FetchArray(i,j)? i for row, j for column?
        ' FetchArray comes from RecordSet.GetRows(), which returns a TRANSPOSED array: i and j
        ' are still the field and record ordinals, row(i) and column(j) in the output file.
    
        For j = j_LBound To j_UBound
    
            If IsNull(FetchArray(j, i)) Then
                arrTemp2(j) = ""
            Else
                arrTemp2(j) = FetchArray(j, i)  ' confused? see he note above
            End If
    
            If CleanupText Or (i_UBound = 0) Then  ' (i_UBound=0): always clean up field names
    
                arrBytes = arrTemp2(j) ' Integer arithmetic is faster than string-handling for
                                       ' this: all VBA string operations require an allocation
    
                For k = LBound(arrBytes) To UBound(arrBytes) Step 2
    
                    Select Case arrBytes(k)
                    Case 10, 13, 9, 160
                        If arrBytes(k + 1) = 0 Then
                            arrBytes(k) = 32 ' replaces CR, LF, Tab, and non-breaking
                        End If               ' spaces with the standard ANSI space
                    Case 44
                        If Not CoerceText Then
                            If arrBytes(k + 1) = 0 Then
                                arrBytes(k) = 32 ' replace comma with the ANSI space
                            End If
                        End If
                    Case 34
                        If arrBytes(k + 1) = 0 Then
                            arrBytes(k) = 39  ' replaces double-quote with single quote
                        End If
                    End Select
    
                Next k
    
                arrTemp2(j) = arrTemp2(j)
    
            End If  ' cleanup
    
    
            If CoerceText Then  ' encapsulate all fields in quotes, numeric or not
    
               arrTemp3(1) = arrTemp2(j)
               arrTemp2(j) = Join$(arrTemp3, vbNullString)
    
            ElseIf (i = 0) And (i = i_UBound) Then ' always encapsulate field names
    
               arrTemp3(1) = arrTemp2(j)
               arrTemp2(j) = Join$(arrTemp3, vbNullString)
    
            Else ' selective encapsulation, leaving numeric fields unencapsulated:
                 ' we *could* do this by reading the ADODB field types: but that's
                 ' slower, and you may be 'caught out' by provider-specific types.
    
                arrBytes = arrTemp2(j)
    
                boolNumeric = True
    
                For k = LBound(arrBytes) To UBound(arrBytes) Step 2
                    If arrBytes(k) < 43 Or arrBytes(k) > 57 Then
    
                        If arrBytes(k) <> 69 Then
                            boolNumeric = False
                            Exit For
                        Else
                            If k > UBound(arrBytes) - 5 Then
                                boolNumeric = False
                                Exit For
                            ElseIf arrBytes(k + 2) = 45 Then
                                ' detect "1.234E-05"
                            ElseIf arrBytes(k + 2) = 43 Then
                                ' detect "1.234E+05"
                            Else
                                boolNumeric = False
                                Exit For
                            End If
                        End If
    
                    End If
                Next k
    
                If boolNumeric Then
                   For k = 1 + LBound(arrBytes) To UBound(arrBytes) Step 2
                       If arrBytes(k) <> 0 Then
                           boolNumeric = False
                           Exit For
                       End If
                   Next k
                End If
    
               arrBytes = vbNullString
    
               If Not boolNumeric Then ' text field, encapsulate it
                   arrTemp3(1) = arrTemp2(j)
                   arrTemp2(j) = Join(arrTemp3, vbNullString)
               End If
    
            End If ' CoerceText
    
        Next j
    
       arrTemp1(i) = Join(arrTemp2, COMMA)
    
    Next i
    
    iRowCount = iRowCount + i - 2
    
    
    '   **** WHY WE 'PUT' A BYTE ARRAY INSTEAD OF A VBA STRING VARIABLE  **** ****
    '
    '       Put #hndFile, , StrConv(Join(arrTemp1, EOROW), vbUnicode)
    '       Put #hndFile, , Join(arrTemp1, EOROW)
    '
    '   If you pass unicode, Wide or UTF-16 string variables to PUT, it prepends a
    '   Unicode Byte Order Mark to the data which, when written to your file, will
    '   render the field names illegible to Microsoft's JET ODBC and ACE-OLEDB SQL
    '   drivers (which can actually read unicode field names, if the helpful label
    '   isn't in the way). The primeval 'PUT' statement writes a Byte array as-is.
    '
    '   **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
    
    
    arrBytes = Join$(arrTemp1, vbCrLf)
    
    
    If hndFile = 0 Then
    
        i_Offset = 1
        If Len(Dir(OutputFile)) > 0 Then
            VBA.FileSystem.Kill OutputFile
        End If
    
        WaitForFileDeletion OutputFile
    
        hndFile = FreeFile
        Open OutputFile For Binary Access Write As #hndFile
    
    End If
    
    
    Put #hndFile, i_Offset, arrBytes
    i_Offset = i_Offset + 1 + UBound(arrBytes)
    Erase arrBytes
    
    
    If rst.EOF Then
        Erase FetchArray
        FetchArray = Empty
    Else
        If IsMissing(FieldList) Then
            FetchArray = rst.GetRows(FETCH_ROWS)
        Else
            FetchArray = rst.GetRows(FETCH_ROWS, , FieldList)
        End If
    End If
    

    Loop ' until isempty(FetchArray)

    If iRowCount < 1 Then ' iRowCount = 0 ' Row Count excludes the header End If

    RecordsetToCSV = iRowCount

    ExitSub:

    On Error Resume Next
    
    If hndFile <> 0 Then
        Close #hndFile
    End If
    
    Erase arrBytes
    Erase arrTemp1
    Erase arrTemp2
    Exit Function
    

    ErrSub:

    Resume ExitSub
    

    End Function

    Public Function FilePath(Path As String) As String ' Strip the filename from a path, leaving only the path to the folder ' The last char of this path will be the backslash

    ' This does not check for the existence or accessibility of the file: ' all we're doing here is string-handling

    Dim strPath As String Dim arrPath() As String

    Const BACKSLASH As String * 1 = "\"

    strPath = Trim(Path)

    If strPath = "" Then Exit Function If Right$(strPath, 1) = BACKSLASH Then Exit Function

    arrPath = Split(strPath, BACKSLASH)

    If UBound(arrPath) = 0 Then ' does not contain "\" FilePath = "" Else arrPath(UBound(arrPath)) = vbNullString FilePath = Join$(arrPath, BACKSLASH) End If

    Erase arrPath

    End Function

    Public Function FileName(Path As String) As String ' Strip the folder and path from a file's path string, leaving only the file name

    ' This does not check for the existence or accessibility of the file: ' all we're doing here is string-handling

    Dim strPath As String Dim arrPath() As String

    Const BACKSLASH As String * 1 = "\"

    strPath = Trim(Path)

    If strPath = "" Then Exit Function If Right$(strPath, 1) = BACKSLASH Then Exit Function

    arrPath = Split(strPath, BACKSLASH)

    If UBound(arrPath) = 0 Then ' does not contain "\" FileName = Path Else FileName = arrPath(UBound(arrPath)) End If

    Erase arrPath

    End Function

    Public Function FileExtension(Path As String) As String ' Return the extension of the file

    ' This is just string-handling: no file or path validation is attempted ' The file extension is deemed to be whatever comes after the final '.' ' The extension is returned with the dot, eg: ".txt" not "txt" ' If no extension is detected, FileExtension returns an empty string

    Dim strFile As String Dim arrFile() As String Const DOT_EXT As String * 1 = "."

    strFile = FileName(Path) strFile = Trim(strFile)

    If strFile = "" Then Exit Function If Right$(strFile, 1) = DOT_EXT Then Exit Function

    arrFile = Split(strFile, DOT_EXT)

    If UBound(arrFile) = 0 Then ' does not contain "\" FileExtension = vbNullString Else FileExtension = arrFile(UBound(arrFile)) FileExtension = Trim(FileExtension) If Len(FileExtension) > 0 Then FileExtension = DOT_EXT & FileExtension End If End If

    Erase arrFile

    End Function

    Public Function FileStripExtension(Path As String) As String ' Return the filename, with the extension removed

    ' This is just string-handling: no file validation is attempted ' The file extension is deemed to be whatever comes after the final '.' ' Both the dot and the extension are removed

    Dim strFile As String Dim arrFile() As String Const DOT_EXT As String * 1 = "."

    strFile = FileName(Path)

    If strFile = "" Then Exit Function If Right$(strFile, 1) = DOT_EXT Then Exit Function

    strFile = Trim(strFile)

    arrFile = Split(strFile, DOT_EXT)

    If UBound(arrFile) = 0 Then ' does not contain "\" FileStripExtension = vbNullString Else ReDim Preserve arrFile(LBound(arrFile) To UBound(arrFile) - 1) FileStripExtension = Join$(arrFile, DOT_EXT) End If

    Erase arrFile

    If strFile = "" Then strFile = .GetTempName strFile = Left(strFile, Len(strFile) - Len(".tmp")) strFile = strFile & strExtn End If If strPath = "" Then strPath = TempSQLFolder End If If Right(strPath, 1) <> "\" Then strPath = strPath & "\" End If strExtn = FileExtension(strFile) If strExtn = "" Then strExtn = ".csv" strFile = strFile & strExtn End If OutputFile = strPath & strFile

    如果您还没有自己的版本,还需要三个路径和文件名实用程序功能:

    • 文件名()
    • 文件路径()
    • FileStripExtension()

    字符串封装逻辑还有改进空间:正确方法是查找记录集的字段类型并相应地应用引号,结果可能比我的笨重更快字节计数方法。但是,我的方法是关于文件使用者以及他们期望看到的内容;并不总是与应该接受的内容对齐。

    如果您成功编写了更快更强大的版本,请告诉我。