VBA / ADO: Reading mixed data types from a .csv data source

Time: 2012-02-21 13:16:58

Tags: sql vba csv ado

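I am having trouble reading mixed data types from a .csv data source: when a column contains a mix of string and numeric values, the strings come back as Null. I have set IMEX=1 and changed the registry key TypeGuessRows from 8 to 0 (but even when the mixed data types occur within the first 8 rows, the strings are still Null). ImportMixedTypes is also set to Text in the registry.

What am I missing? Any ideas are much appreciated.

Here is my connection string: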

ConnString = "Provider=Microsoft.Jet.OLEDB.4.0;" _
    & "Data Source=" & Folder & ";" _
    & "Extended Properties='text;HDR=YES;FMT=CSVDelimited;IMEX=1';" _
    & "Persist Security Info=False;"

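2 Answers:

Answer 0: (score: 1)

Here is another code sample that does not use ADO. It is similar to the one Fink posted, but with more flexibility and error handling. Performance is decent too (reading and parsing a 20 MB csv file takes under 3 seconds on my machine).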

Public Function getDataFromFile(parFileName As String, parDelimiter As String, Optional parExcludeCharacter As String = "") As Variant
'Requires a reference to Microsoft Scripting Runtime (scrrun.dll)
'parFileName is expected to be a delimited file (csv...)
'Returns an Empty Variant if the file is empty or can't be opened
'Number of columns is based on the line with the most columns, not on the first line
'parExcludeCharacter: csv files sometimes have quotes around strings: "XXX" - if parExcludeCharacter = """" the quotes are removed

  Dim locLinesList() As Variant
  Dim locData As Variant
  Dim i As Long
  Dim j As Long
  Dim locNumRows As Long
  Dim locNumCols As Long
  Dim fso As New FileSystemObject
  Dim ts As TextStream
  Const REDIM_STEP = 10000

  On Error GoTo error_open_file
  Set ts = fso.OpenTextFile(parFileName)
  On Error GoTo unhandled_error

  'Counts the number of lines and finds the largest number of columns'
  ReDim locLinesList(1 To 1) As Variant
  i = 0
  Do While Not ts.AtEndOfStream
    If i Mod REDIM_STEP = 0 Then
      ReDim Preserve locLinesList(1 To UBound(locLinesList, 1) + REDIM_STEP) As Variant
    End If
    locLinesList(i + 1) = Split(ts.ReadLine, parDelimiter)
    j = UBound(locLinesList(i + 1), 1) 'zero-based index of the last column (column count - 1)
    If locNumCols < j Then locNumCols = j
    i = i + 1
  Loop

  ts.Close

  locNumRows = i

  If locNumRows = 0 Then Exit Function 'Empty file'

  ReDim locData(1 To locNumRows, 1 To locNumCols + 1) As Variant

  'Copies the file into an array; when parExcludeCharacter is given,
  'it is stripped from the start and the end of each field
  For i = 1 To locNumRows
    For j = 0 To UBound(locLinesList(i), 1)
      locData(i, j + 1) = locLinesList(i)(j)
      If parExcludeCharacter <> "" Then
        If Left(locData(i, j + 1), 1) = parExcludeCharacter Then
          locData(i, j + 1) = Mid(locData(i, j + 1), 2)
        End If
        If Right(locData(i, j + 1), 1) = parExcludeCharacter Then
          locData(i, j + 1) = Left(locData(i, j + 1), Len(locData(i, j + 1)) - 1)
        End If
      End If
    Next j
  Next i

  getDataFromFile = locData

  Exit Function

error_open_file: 'returns empty variant'
unhandled_error: 'returns empty variant'

End Function
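
A call to the function above might look like the following sketch. The procedure name DemoGetData and the path C:\test.csv are placeholders that are not part of the original answer. The IsEmpty check relies on the function returning an uninitialised Variant when the file is empty or cannot be opened.

```vba
Public Sub DemoGetData()
    'Hypothetical caller: adjust the path and the delimiter to your file
    Dim data As Variant
    Dim r As Long

    data = getDataFromFile("C:\test.csv", ",")

    If IsEmpty(data) Then
        Debug.Print "File is empty or could not be opened"
        Exit Sub
    End If

    'The result is a 1-based 2D array; fields arrive as text, so a column
    'mixing strings and numbers is kept as-is instead of turning into Null
    For r = 1 To UBound(data, 1)
        Debug.Print data(r, 1), IsNumeric(data(r, 1))
    Next r
End Sub
```

Because every field is read as text, any conversion to numbers (for example with CDbl after an IsNumeric check) is left to the caller.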

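Answer 1: (score: 0)

Are you locked into using ADO to read the CSV? I always seem to run into problems like the one you are having when I try to read text files with ADO. I usually just give up on the ADO side and read the file directly with a text reader to get more control.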

Public Sub TestIt()

    Dim path As String

    path = "C:\test.csv"

    ReadText path
End Sub

Public Sub ReadText(path As String)
'requires reference to 'Microsoft Scripting Runtime' scrrun.dll OR use late binding

    Const DELIM As String = ","
    Dim fso As New Scripting.FileSystemObject
    Dim text As Scripting.TextStream
    Dim line As String
    Dim vals() As String

    Set text = fso.OpenTextFile(path, ForReading)

    Do While Not text.AtEndOfStream

        line = text.ReadLine

        vals = Split(line, DELIM)

        'do something with the values
    Loop

    text.Close
End Sub
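
The comment in ReadText mentions late binding as an alternative to setting the reference. A minimal sketch of that variant follows. The function name ReadTextLateBound, the delim parameter and the FOR_READING constant are illustrative assumptions, not part of the original answer.

```vba
Public Function ReadTextLateBound(path As String, Optional delim As String = ",") As Collection
    'Late-bound sketch: no reference to Microsoft Scripting Runtime is needed
    Const FOR_READING As Long = 1   'numeric value of ForReading
    Dim fso As Object
    Dim ts As Object
    Dim rows As New Collection

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(path, FOR_READING)

    Do While Not ts.AtEndOfStream
        'Split returns strings, so mixed text/number columns are kept verbatim
        rows.Add Split(ts.ReadLine, delim)
    Loop

    ts.Close
    Set ReadTextLateBound = rows
End Function
```

Each item in the returned Collection is a zero-based array of strings, so rows(1)(0) is the first field of the first line. Like the original, this splits naively on the delimiter and does not handle quoted fields that contain the delimiter themselves.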