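Importing a huge 550,000+ row CSV file into Access

Time: 2010-02-03 09:47:34

Tags: ms-access csv

I have a CSV file with 550,000+ rows that I need to import into Access, but when I try, it throws an error saying the file is too large (1.7 GB). Can you recommend a way to get this file into Access?

Thanks,

Darryl

7 answers: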

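Answer 0: (score: 1)

Try linking instead of importing ("Get External Data" -> "Link Tables" in Access 2003). This leaves the data in the CSV file and reads it directly, in place. Linking does not limit the size (at least not anywhere near 1.7 GB). It may restrict some of your read/update operations, but it will at least get you started.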

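Answer 1: (score: 1)

I would either try the CSV ODBC connector, or import the file into a less limited database first (MySQL, SQL Server) and then import it from there.

It seems that some versions of Access have a 2 GB limit on MDB files, so you may run into trouble either way.

Good luck.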

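Answer 2: (score: 1)

You can also use an ETL tool. Kettle is an open source one (http://kettle.pentaho.org/) and is quite easy to use. To import a file into a database, you need a single transformation with two steps: a CSV Text Input and a Table Output.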

Answer 3: (score: 1)

Why use Access for huge files? Use SQL Express or Firebird instead.

Answer 4: (score: 0)

I remember that Access has a size limit of around 2 GB. Going to the free SQL Express (limited to 4 GB) or free MySQL (no size limit) could be easier.

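Answer 5: (score: 0)

Another option would be to do away with the standard import functions and write your own. I have done this once before, when some specific logic needed to be applied to the data before import. The basic structure is:

Open the file
Get the first line
Loop until the end of the line
If we find a comma, move on to the next field
Put the record into the database
Get the next line, repeat, etc.

I wrapped it in a transaction that committed every 100 rows, because I found that improved performance in my case. Whether that helps will depend on your data.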
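The loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code the answer refers to: it is written in Python rather than VBA, it uses the standard-library sqlite3 module as a stand-in for the Access database, and the names import_csv, imported, and batch_size are hypothetical. The csv module takes care of splitting each line on commas (including quoted fields), and the rows are committed in batches of 100.

```python
import csv
import io
import sqlite3


def import_csv(lines, conn, table="imported", batch_size=100):
    """Insert CSV rows into `table`, committing once per `batch_size` rows.

    `lines` is any iterable of text lines (an open file works).
    The first line is assumed to be a header row with the column names.
    Returns the number of data rows inserted.
    """
    reader = csv.reader(lines)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    # sqlite3 is a stand-in here; an Access table would be created differently.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    insert_sql = f"INSERT INTO {table} VALUES ({placeholders})"

    total = 0
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            # One transaction per batch instead of one per row.
            conn.executemany(insert_sql, batch)
            conn.commit()
            total += len(batch)
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        conn.executemany(insert_sql, batch)
        conn.commit()
        total += len(batch)
    return total


if __name__ == "__main__":
    sample = io.StringIO("id,name\n1,alpha\n2,beta\n3,gamma\n")
    connection = sqlite3.connect(":memory:")
    print(import_csv(sample, connection, batch_size=2))
```

Because the file is read one line at a time, memory use stays flat no matter how large the CSV is. The batch size is worth tuning against your own data, as noted above.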

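However, I would say that linking to the data, as others have suggested, is the best solution. This is just an option if you absolutely have to have the data in Access.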

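Answer 6: (score: 0)

Access creates a lot of overhead, so even relatively small data sets can bloat the file to 2 GB, at which point it will shut down. Here are a couple of straightforward ways of doing the import. I have not tested them on huge files, but the concepts will definitely work on regular files.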

Import data from a closed workbook (ADO)

If you want to import a lot of data from a closed workbook you can do this with ADO and the macro below. If you want to retrieve data from another worksheet than the first worksheet in the closed workbook, you have to refer to a user defined named range. The macro below can be used like this (in Excel 2000 or later):
GetDataFromClosedWorkbook "C:\FolderName\WorkbookName.xls", "A1:B21", ActiveCell, False
GetDataFromClosedWorkbook "C:\FolderName\WorkbookName.xls", "MyDataRange", Range("B3"), True

Sub GetDataFromClosedWorkbook(SourceFile As String, SourceRange As String, _
    TargetRange As Range, IncludeFieldNames As Boolean)
' requires a reference to the Microsoft ActiveX Data Objects library
' if SourceRange is a range reference:
'   this will return data from the first worksheet in SourceFile
' if SourceRange is a defined name reference:
'   this will return data from any worksheet in SourceFile
' SourceRange must include the range headers
'
Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
Dim dbConnectionString As String
Dim TargetCell As Range, i As Integer
    dbConnectionString = "DRIVER={Microsoft Excel Driver (*.xls)};" & _
        "ReadOnly=1;DBQ=" & SourceFile
    Set dbConnection = New ADODB.Connection
    On Error GoTo InvalidInput
    dbConnection.Open dbConnectionString ' open the database connection
    Set rs = dbConnection.Execute("[" & SourceRange & "]")
    Set TargetCell = TargetRange.Cells(1, 1)
    If IncludeFieldNames Then
        For i = 0 To rs.Fields.Count - 1
            TargetCell.Offset(0, i).Formula = rs.Fields(i).Name
        Next i
        Set TargetCell = TargetCell.Offset(1, 0)
    End If
    TargetCell.CopyFromRecordset rs
    rs.Close
    dbConnection.Close ' close the database connection
    Set TargetCell = Nothing
    Set rs = Nothing
    Set dbConnection = Nothing
    On Error GoTo 0
    Exit Sub
InvalidInput:
    MsgBox "The source file or source range is invalid!", _
        vbExclamation, "Get data from closed workbook"
End Sub


Another method that doesn't use the CopyFromRecordSet-method

With the macro below you can perform the import and have better control over the results returned from the RecordSet.

Sub TestReadDataFromWorkbook()
' fills data from a closed workbook in at the active cell
Dim tArray As Variant, r As Long, c As Long
    tArray = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "A1:B21")
    ' without using the transpose function
    For r = LBound(tArray, 2) To UBound(tArray, 2)
        For c = LBound(tArray, 1) To UBound(tArray, 1)
            ActiveCell.Offset(r, c).Formula = tArray(c, r)
        Next c
    Next r
    ' using the transpose function (has limitations)
'    tArray = Application.WorksheetFunction.Transpose(tArray)
'    For r = LBound(tArray, 1) To UBound(tArray, 1)
'        For c = LBound(tArray, 2) To UBound(tArray, 2)
'            ActiveCell.Offset(r - 1, c - 1).Formula = tArray(r, c)
'        Next c
'    Next r
End Sub

Private Function ReadDataFromWorkbook(SourceFile As String, SourceRange As String) As Variant
' requires a reference to the Microsoft ActiveX Data Objects library
' if SourceRange is a range reference:
'   this function can only return data from the first worksheet in SourceFile
' if SourceRange is a defined name reference:
'   this function can return data from any worksheet in SourceFile
' SourceRange must include the range headers
' examples:
' varRecordSetData = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "A1:A21")
' varRecordSetData = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "A1:B21")
' varRecordSetData = ReadDataFromWorkbook("C:\FolderName\SourceWbName.xls", "DefinedRangeName")
Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
Dim dbConnectionString As String
    dbConnectionString = "DRIVER={Microsoft Excel Driver (*.xls)};ReadOnly=1;DBQ=" & SourceFile
    Set dbConnection = New ADODB.Connection
    On Error GoTo InvalidInput
    dbConnection.Open dbConnectionString ' open the database connection
    Set rs = dbConnection.Execute("[" & SourceRange & "]")
    On Error GoTo 0
    ReadDataFromWorkbook = rs.GetRows ' returns a two dim array with all records in rs
    rs.Close
    dbConnection.Close ' close the database connection
    Set rs = Nothing
    Set dbConnection = Nothing
    On Error GoTo 0
    Exit Function
InvalidInput:
    MsgBox "The source file or source range is invalid!", vbExclamation, "Get data from closed workbook"
    Set rs = Nothing
    Set dbConnection = Nothing
End Function

For really large files, you can try something like this:

INSERT INTO [Table] (Column1, Column2)
SELECT *
FROM [Excel 12.0 Xml;HDR=No;Database=C:\your_path\excel.xlsx].[SHEET1$];

OR

SELECT * INTO [NewTable]
FROM [Excel 12.0 Xml;HDR=No;Database=C:\your_path\excel.xlsx].[SHEET1$];