Question

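I have a database with about 180,000 records, and I am trying to attach a PDF file to each record. Each PDF is about 250 KB. However, after about a minute my program is using roughly a gigabyte of memory and I have to stop it. I tried to write the code so that the reference to each LINQ object is dropped once it has been updated, but that does not seem to help. How can I explicitly clear the references?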

Thanks for your help.

Private Sub uploadPDFs(ByVal args() As String)
    Dim indexFiles = (From indexFile In dataContext.IndexFiles
                     Where indexFile.PDFContent = Nothing
                     Order By indexFile.PDFFolder).ToList
    Dim currentDirectory As IO.DirectoryInfo
    Dim currentFile As IO.FileInfo
    Dim tempIndexFile As IndexFile

    While indexFiles.Count > 0
        tempIndexFile = indexFiles(0)
        indexFiles = indexFiles.Skip(1).ToList
        currentDirectory = 'I set the directory that I need
        currentFile = 'I get the file that I need
        writePDF(currentDirectory, currentFile, tempIndexFile)
    End While
End Sub

Private Sub writePDF(ByVal directory As IO.DirectoryInfo, ByVal file As IO.FileInfo, ByVal indexFile As IndexFile)
    Dim bytes() As Byte
    bytes = getFileStream(file)
    indexFile.PDFContent = bytes
    dataContext.SubmitChanges()
    counter += 1
    If counter Mod 10 = 0 Then Console.WriteLine("     saved file " & file.Name & " at " & directory.Name)
End Sub


Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
    Dim fileStream = fileInfo.OpenRead()
    Dim bytesLength As Long = fileStream.Length
    Dim bytes(bytesLength) As Byte

    fileStream.Read(bytes, 0, bytesLength)
    fileStream.Close()

    Return bytes
End Function

Answer 1

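I suggest you do this in batches, using Take (before the call to ToList) to process a fixed number of items at a time. Read (say) 10 records, set PDFContent on all of them, call SubmitChanges, and then start again. (I'm not sure offhand whether you should start with a new DataContext at that point, but doing so is probably the cleanest approach.)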

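Also, your code for reading the contents of the file is broken in at least a couple of ways, but it would be simpler to just use File.ReadAllBytes in the first place.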
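The answer does not spell out the problems, but presumably they include the following: in VB, Dim bytes(bytesLength) declares an upper bound, so the array has one element more than the file has bytes; Stream.Read is not guaranteed to return all the requested bytes in a single call; and the stream is never closed if Read throws. A minimal sketch of the File.ReadAllBytes replacement (the module and function names are only illustrative):

```vbnet
Imports System.IO

Module PdfReader
    ' File.ReadAllBytes opens the file, reads it to the end into an
    ' exactly-sized array, and closes it again, even if reading fails.
    Function ReadPdfBytes(ByVal fileInfo As FileInfo) As Byte()
        Return File.ReadAllBytes(fileInfo.FullName)
    End Function
End Module
```

With this in place, writePDF could assign ReadPdfBytes(file) to indexFile.PDFContent directly.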

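In addition, the way you handle the gradually shrinking list is very inefficient: after fetching 180,000 records, you build a new list of 179,999 records, then another of 179,998 records, and so on.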
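The batching idea could look roughly like the sketch below. It is not tested against a real database and rests on assumptions: MyDataContext stands in for whatever generated DataContext class the question uses, and resolvePath is a hypothetical callback that returns the full path of the PDF for a record, since the question does not show how the directory and file are located. IndexFiles, PDFContent and PDFFolder are the names from the question.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module BatchUploader
    Private Const BatchSize As Integer = 10

    ' resolvePath is a hypothetical callback: record -> full path of its PDF.
    Sub UploadPdfs(ByVal resolvePath As Func(Of IndexFile, String))
        Do
            ' MyDataContext is a placeholder for the generated DataContext.
            ' A fresh context per batch lets the entities tracked for the
            ' previous batch (and their PDF bytes) be garbage collected.
            Using db As New MyDataContext()
                Dim batch = (From indexFile In db.IndexFiles
                             Where indexFile.PDFContent Is Nothing
                             Order By indexFile.PDFFolder
                             Take BatchSize).ToList()

                ' Nothing left without a PDF: we are done.
                If batch.Count = 0 Then Exit Do

                For Each item In batch
                    item.PDFContent = File.ReadAllBytes(resolvePath(item))
                Next

                ' One round trip per batch instead of one per record.
                db.SubmitChanges()
            End Using
        Loop
    End Sub
End Module
```

Because each pass re-queries for records that still have no PDF, there is no in-memory list to shrink. The loop ends only if every record in a batch actually receives content, so a file that cannot be read has to be handled (here it would simply throw) rather than skipped silently.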

Answer 2

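Does the DataContext have ObjectTrackingEnabled set to true (the default)? If so, it will try to keep a record of essentially all the data it touches, which prevents the garbage collector from collecting any of it.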

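If so, you should be able to fix the problem by periodically disposing of the DataContext and creating a new one, or by turning off object tracking. (Keep in mind that a DataContext with object tracking turned off is read-only, so SubmitChanges cannot be used on it; for an update loop like this one, recreating the context is the option that applies.)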

Answer 3

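OK. To use the least amount of memory, we have to update the DataContext in blocks. I have put some sample code below. Since I am typing it in Notepad, it may contain syntax errors.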

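    Dim DB As New YourDataContext()
    Dim BlockSize As Integer = 25
    ' Materialize the pending records once, so Count and indexing do not hit the database repeatedly.
    Dim AllItems = DB.Items.Where(Function(i) i.PDF Is Nothing).ToList()

    Dim count = 0
    Dim tmpDB As New YourDataContext()

    While count < AllItems.Count

        ' Copy the key into a local so the query only captures a plain value.
        Dim currentId = AllItems(count).recordID
        Dim _item = tmpDB.Items.Single(Function(i) i.recordID = currentId)
        _item.PDF = GetPDF()

        count += 1

        ' Flush at the end of every block and after the last record.
        If count Mod BlockSize = 0 OrElse count = AllItems.Count Then
            tmpDB.SubmitChanges()
            ' Replace the context so the entities tracked for this block can be released.
            tmpDB.Dispose()
            tmpDB = New YourDataContext()
            GC.Collect()
        End If

    End While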

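To further optimize for speed, you can pull just the recordID values from AllItems into an array (for example via an anonymous type) and set Delay Loaded for the PDF field.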
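As a rough illustration of the first suggestion, the keys could be fetched on their own, so that no entities are loaded or tracked while building the work list. The sketch reuses the placeholder names from the sample above (YourDataContext, Items, recordID, PDF) and assumes recordID is an Integer; none of this is confirmed by the question.

```vbnet
Imports System.Linq

Module PendingIds
    ' Returns only the keys of records that still lack a PDF.
    ' YourDataContext, Items, recordID and PDF are placeholder names;
    ' recordID is assumed to be an Integer.
    Function LoadPendingIds() As Integer()
        Using db As New YourDataContext()
            Return (From item In db.Items
                    Where item.PDF Is Nothing
                    Select item.recordID).ToArray()
        End Using
    End Function
End Module
```

The update loop can then walk this array and load each record by key in the short-lived context, as in the sample above.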

LINQ SubmitChanges runs out of memory
