OutOfMemoryException when importing a large CSV file

Date: 2018-04-10 19:11:08

Tags: powershell

When importing a large spreadsheet (roughly 400,000 rows), I keep getting an exception of type System.OutOfMemoryException. My code is basically as follows:

# Build the sqlbulkcopy connection, and set the timeout to infinite 
$CsvFile = "somefile.csv"
$BatchSize = 50000
$BulkCopy = New-Object Data.SqlClient.SqlBulkCopy($ConnectionString, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock) 
$BulkCopy.DestinationTableName = $Table 
$BulkCopy.BulkCopyTimeout = 0 
$BulkCopy.BatchSize = $BatchSize 

# Create the DataTable; the columns are auto-generated from the first record. 
$NumLines = (Get-Content ${DataPath}\$CsvFile | Measure-Object -Line).Lines - 1 # Exclude header row
$DataTable = New-Object System.Data.DataTable 
$AllRecords = Import-Csv -Path ${DataPath}\$CsvFile -Delimiter $CsvDelimiter
$First = $true

for ($Counter = 0; $Counter -lt $AllRecords.Length; $Counter++) {
    $Object = $AllRecords[$Counter]
    # On the first record, auto-generate the DataTable columns from the property names
    if ($First) {
        foreach ($Property in $Object.PsObject.Properties) {
            $null = $DataTable.Columns.Add($Property.Name)
        }
    }
    $Dr = $DataTable.NewRow()
    foreach ($Property in $Object.PsObject.Properties) {   
        $Dr.Item($Property.Name) = $Property.Value 
    }
    $DataTable.Rows.Add($Dr)
    $First = $false 

    if ($Counter % $BatchSize -eq 0) {
        $BulkCopy.WriteToServer($DataTable) 
        $DataTable.Clear() 
        # [System.GC]::Collect()
        $Pct = $Counter / $NumLines * 100
    }
}

# Add in all the remaining rows since the last clear
if ($DataTable.Rows.Count -gt 0) { 
    $BulkCopy.WriteToServer($DataTable) 
    $DataTable.Clear() 
    # [System.GC]::Collect()
    $Pct = $Counter / $NumLines * 100
} 

# Clean Up 
if ($NumLines -ge 1) {
    $AllRecords = $null  # arrays have no Clear()/Dispose(); drop the reference instead
    $BulkCopy.Close()
    $BulkCopy.Dispose() 
    $DataTable.Dispose() 
}

# Sometimes the Garbage Collector takes too long to clear the huge datatable. 
[System.GC]::Collect()

I call $DataTable.Clear() once we've processed a number of rows equal to $BatchSize (50,000 in this case). I also tried adding [System.GC]::Collect() after the DataTable.Clear() to see whether that would help, but no dice. Any help would be greatly appreciated.
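For comparison, here is a streaming variant I could try, where Import-Csv stays in the pipeline so rows are processed one at a time instead of all being loaded into $AllRecords first (an untested sketch reusing the $DataPath, $CsvFile, $CsvDelimiter, $BatchSize, $DataTable, and $BulkCopy values set above):

# Streaming variant: keep Import-Csv in the pipeline so only one record is live at a time
$Counter = 0
Import-Csv -Path ${DataPath}\$CsvFile -Delimiter $CsvDelimiter | ForEach-Object {
    # Auto-generate the columns from the first record's property names
    if ($DataTable.Columns.Count -eq 0) {
        foreach ($Property in $_.PsObject.Properties) {
            $null = $DataTable.Columns.Add($Property.Name)
        }
    }
    $Dr = $DataTable.NewRow()
    foreach ($Property in $_.PsObject.Properties) {
        $Dr.Item($Property.Name) = $Property.Value
    }
    $DataTable.Rows.Add($Dr)
    $Counter++
    if ($Counter % $BatchSize -eq 0) {
        $BulkCopy.WriteToServer($DataTable)
        $DataTable.Clear()
    }
}
# Flush the rows remaining after the last full batch
if ($DataTable.Rows.Count -gt 0) {
    $BulkCopy.WriteToServer($DataTable)
    $DataTable.Clear()
}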

1 Answer:

Answer 0 (score: 0)

I think the Import-Excel module handles this more efficiently. You can get it from here.
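A minimal usage sketch, assuming the ImportExcel module is installed from the PowerShell Gallery and the data is saved as an .xlsx workbook (the somefile.xlsx path is a placeholder):

# One-time install from the PowerShell Gallery
Install-Module -Name ImportExcel -Scope CurrentUser

# Read the worksheet into row objects (reads .xlsx workbooks, not raw CSV)
$Rows = Import-Excel -Path "${DataPath}\somefile.xlsx"
$Rows.Count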