Loading a very large table from MSSQL into VB.NET memory

Date: 2013-12-09 10:18:45

Tags: vb.net sql-server-2012 large-data

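I have a very large table with about 22 million rows, and I want to load all of them into the memory of my VB.NET application (I have 36 GB of RAM, so I should be fine).

I would like to ask for the best way to load that large table without the connection timing out.

It would also help if there were a way to report progress to the application, since the load will probably take a few minutes to complete.

I normally use SqlDataReader, but can it be used with this much data?

I tried googling the problem a bit. The reason I am loading the whole table into memory is to analyze it faster: I need to run some regular expressions on it and sort it in ways that T-SQL does not offer.

I hope someone can help, because I am a bit stuck on this.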

1 answer:

Answer 0: (score: 1)

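You should use ROW_NUMBER to divide the huge result set into smaller chunks. Then you can report progress after each chunk, for example by using a BackgroundWorker to update a Label and/or a ProgressBar. To determine how many chunks there are, select the total row count first and divide it by the chunk size (1000 in the example below):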

Here is a working approach that fills a DataTable, using LINQ to split the huge result into small groups of rows:

Handle the button click and start the BackgroundWorker:

Private Sub SomeButton_Click(sender As System.Object, e As System.EventArgs) Handles Button2.Click
    Me.BackgroundWorker1.RunWorkerAsync()
End Sub

Handle the DoWork event to load the data:

Private Sub BackgroundWorker1_DoWork(sender As Object, e As System.ComponentModel.DoWorkEventArgs) Handles BackgroundWorker1.DoWork
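    ' Requires Imports System.Data and Imports System.Data.SqlClient at the top of the file,
    ' and BackgroundWorker1.WorkerReportsProgress = True (otherwise ReportProgress throws).
    Dim tblData As New DataTable()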
    Dim totalCount = 0
    Dim chunkSize As Int32 = 1000
    Dim countSQL = "SELECT COUNT(*) FROM dbo.tabData"
    Dim dataSql = "WITH CTE AS(SELECT d.*, rn=ROW_NUMBER()OVER(ORDER BY d.idData) FROM dbo.tabData d) SELECT * FROM CTE WHERE RN BETWEEN @RowStart AND @RowEnd;"
    Using con As New SqlConnection(My.Settings.ConnectionString)
        Using cmdCount = New SqlCommand(countSQL, con)
            con.Open()
            totalCount = DirectCast(cmdCount.ExecuteScalar, Integer)
        End Using
        Dim chunks = Enumerable.Range(0, totalCount).
            GroupBy(Function(i) i \ chunkSize).
            Select(Function(grp, index) New With {
                       .RowStart = grp.Min() + 1,
                       .RowEnd = grp.Max() + 1,
                       .GroupNum = index + 1
                   })
        For Each chunk In chunks
            Using cmdData = New SqlCommand(dataSql, con)
                cmdData.Parameters.AddWithValue("@RowStart", chunk.RowStart)
                cmdData.Parameters.AddWithValue("@RowEnd", chunk.RowEnd)
                Using da = New SqlDataAdapter(cmdData)
                    da.Fill(tblData)
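                    ' Scale to 0-100 and cap at 100; 100.0 forces floating-point math and avoids Integer overflow.
                    Dim percent = CInt(Math.Min(100.0, Math.Ceiling(100.0 * chunk.RowEnd / totalCount)))
                    BackgroundWorker1.ReportProgress(percent)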
                End Using
            End Using
        Next
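        e.Result = tblData  ' hand the filled table to RunWorkerCompleted; otherwise it is lost when DoWork returns
        BackgroundWorker1.ReportProgress(100)  ' all data loaded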
    End Using
End Sub
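
The LINQ expression in DoWork groups one integer per row just to find each chunk's first and last row number, which for 22 million rows means materializing 22 million integers. As a minimal sketch, the same 1-based ranges could probably be computed arithmetically instead. The names ChunkSketch and ChunkRanges are hypothetical and not part of the original answer:

```vbnet
Imports System
Imports System.Collections.Generic

Module ChunkSketch
    ' Returns 1-based, inclusive (RowStart, RowEnd) pairs covering rows 1..totalCount.
    ' Assumes chunkSize > 0; a totalCount of 0 yields an empty list.
    Function ChunkRanges(totalCount As Integer, chunkSize As Integer) As List(Of Tuple(Of Integer, Integer))
        Dim ranges As New List(Of Tuple(Of Integer, Integer))()
        For rowStart As Integer = 1 To totalCount Step chunkSize
            ' The last chunk may be shorter than chunkSize.
            Dim rowEnd = Math.Min(rowStart + chunkSize - 1, totalCount)
            ranges.Add(Tuple.Create(rowStart, rowEnd))
        Next
        Return ranges
    End Function

    Sub Main()
        ' Prints 1-1000, 1001-2000 and 2001-2500, one range per line.
        For Each r In ChunkRanges(2500, 1000)
            Console.WriteLine(r.Item1 & "-" & r.Item2)
        Next
    End Sub
End Module
```

Each pair maps directly onto the @RowStart and @RowEnd parameters of the data query, so the For Each loop in DoWork could iterate over this list in place of the LINQ result.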

Update the Label and/or ProgressBar after each chunk, and once more at the end:

Private Sub BackgroundWorker1_ProgressChanged(sender As Object, e As System.ComponentModel.ProgressChangedEventArgs) Handles BackgroundWorker1.ProgressChanged
    Me.ProgressLabel.Text = e.ProgressPercentage & " Percent loaded"
    Me.ProgressBar1.Value = e.ProgressPercentage
End Sub

Private Sub BackgroundWorker1_RunWorkerCompleted(sender As Object, e As System.ComponentModel.RunWorkerCompletedEventArgs) Handles BackgroundWorker1.RunWorkerCompleted
    Me.ProgressLabel.Text = "100 Percent loaded. Finished."
End Sub
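
The question also asks whether SqlDataReader can be used with this much data. For comparison with the chunked DataTable approach above, here is a rough sketch of streaming the table through a single reader and reporting progress every so many rows. It is an alternative technique, not the answer's method. LoadWithReader, its parameters, and the reportEvery interval are hypothetical names; dbo.tabData is reused from the example above, and the sketch assumes the worker has WorkerReportsProgress set to True and that totalCount was obtained beforehand with the COUNT(*) query:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.ComponentModel
Imports System.Data.SqlClient

Module ReaderLoadSketch
    ' Streams every row of dbo.tabData into a list of value arrays,
    ' reporting progress to the worker every reportEvery rows.
    Function LoadWithReader(connectionString As String,
                            totalCount As Integer,
                            worker As BackgroundWorker,
                            Optional reportEvery As Integer = 100000) As List(Of Object())
        Dim rows As New List(Of Object())()
        Using con As New SqlConnection(connectionString)
            Using cmd As New SqlCommand("SELECT * FROM dbo.tabData", con)
                cmd.CommandTimeout = 0 ' 0 disables the command timeout
                con.Open()
                Using reader = cmd.ExecuteReader()
                    While reader.Read()
                        ' Copy the current row's column values into a fresh array.
                        Dim values(reader.FieldCount - 1) As Object
                        reader.GetValues(values)
                        rows.Add(values)
                        If totalCount > 0 AndAlso rows.Count Mod reportEvery = 0 Then
                            ' 100.0 forces floating-point math; cap at 100 in case rows were added since the count.
                            worker.ReportProgress(CInt(Math.Min(100.0, 100.0 * rows.Count / totalCount)))
                        End If
                    End While
                End Using
            End Using
        End Using
        Return rows
    End Function
End Module
```

Called from DoWork as e.Result = LoadWithReader(My.Settings.ConnectionString, totalCount, BackgroundWorker1), this reads the table in one pass instead of re-running the ROW_NUMBER query for every chunk. Whether it is actually faster for a given table would need to be measured.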