Removing the items of one ArrayList from another ArrayList in VB.NET

Time: 2013-08-27 19:20:19

Tags: sql-server arrays vb.net tsql arraylist

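I'm writing a VB.NET WebForms site, and one page has to load a list of files into a list box. It needs to load every PDF and TIF file in a directory that does not yet have an entry in the database. The following code does this successfully. Basically, I query the database for an ArrayList of file-name entries, then loop through every file in the directory, check its name against each entry in the ArrayList, and, if the name is not in the ArrayList, add it to a list that gets bound to the list box: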

    Dim category As String = "RFQ"

    'Initialize database connection variables
    Dim sql As String
    Dim query As System.Data.SqlClient.SqlCommand
    Dim result As System.Data.SqlClient.SqlDataReader

    'Load document list from database
    Dim savedfiles As New ArrayList
    database.Open() 'Open connection to database  
    sql = "SELECT filename FROM fileheaders WHERE [category] = '" & category & "'" 'SQL query to read file header information
    query = New System.Data.SqlClient.SqlCommand(sql, database) 'Create query to send to database
    result = query.ExecuteReader() 'Execute query
    While result.Read()
        savedfiles.Add(row(result, "filename"))
    End While
    result.Close()
    database.Close() 'Close the same connection that was opened above


    'The following code section pulls all files from the current file directory.
    Dim filelist = New ArrayList
    Dim dir As New System.IO.DirectoryInfo(dirName) 'Get directory information
    Dim files As System.IO.FileInfo() = dir.GetFiles() 'Get all files in directory
    Dim file As System.IO.FileInfo
    Dim i As Integer = 0
    For Each file In files
        If ((file.Extension Like ".pdf") Or (file.Extension Like ".tif")) And Not inArray(savedfiles, file.Name) Then
            filelist.Add(file.Name) 'Add .pdf and .tif files to list of documents
        End If
    Next

    filelist.TrimToSize()
    eltFilelist.DataSource = filelist
    eltFilelist.DataBind() 'Bind document list to listbox

Then the code for the inArray function:

    Function inArray(arr As ArrayList, str As String) As Boolean
        'Linear scan: compare str against every String element in arr
        For Each item In arr
            If TypeOf item Is String Then
                If str = CStr(item) Then
                    Return True
                End If
            End If
        Next
        Return False
    End Function

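Here's the problem: it works, but it seems very inefficient. There are about 27,000 files in the directory and about 26,000 file entries in the database, so I'm checking each of the 27,000 file names against a list of 26,000 names. Without turning this into a combinatorics problem, that's hundreds of millions of string comparisons. Is there a more efficient way to go about this?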

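2 Answers:

Answer 0: (score: 0)

Use a Dictionary or Hashtable to hold the file names from your query instead of an ArrayList.

Your inArray function performs an O(n) scan of the whole list for every file it finds, which is very slow.

Both collections offer a hash-based key lookup (ContainsKey on a Dictionary, Contains or ContainsKey on a Hashtable) that can find your file name much faster, in roughly constant time on average.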
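As a rough illustration of the hash-based lookup described above, here is a minimal sketch. It swaps in HashSet(Of String) rather than the Dictionary or Hashtable the answer names, since only the keys are needed. The module name, the FilterUnsaved function, and the sample file names are all hypothetical, and the case-insensitive comparer is an assumption about how the names should match.

```vbnet
Imports System
Imports System.Collections.Generic

Module HashSetLookupSketch
    ' Returns the names in directoryNames that do not appear in savedNames.
    ' Assumption: names should match case-insensitively.
    Function FilterUnsaved(directoryNames As IEnumerable(Of String),
                           savedNames As IEnumerable(Of String)) As List(Of String)
        ' Building the set is O(n); each Contains call is then O(1) on average.
        Dim saved As New HashSet(Of String)(savedNames, StringComparer.OrdinalIgnoreCase)
        Dim unsaved As New List(Of String)
        For Each name As String In directoryNames
            If Not saved.Contains(name) Then
                unsaved.Add(name)
            End If
        Next
        Return unsaved
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the database and directory.
        Dim savedNames = New String() {"a.pdf", "b.tif"}
        Dim directoryNames = New String() {"A.PDF", "c.pdf", "b.tif", "d.tif"}
        Dim result = FilterUnsaved(directoryNames, savedNames)
        Console.WriteLine(String.Join(", ", result)) ' prints: c.pdf, d.tif
    End Sub
End Module
```

With about 27,000 files and 26,000 database entries, this does on the order of tens of thousands of hash lookups instead of hundreds of millions of string comparisons.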

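Answer 1: (score: 0)

You can use SQL parameters to avoid problems with the category string (for example, an apostrophe in it would break the concatenated query string), fetch only the files in the directory that have the extensions you are interested in, and use LINQ to get the missing files in a straightforward way: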

    Imports System.Data
    Imports System.Data.SqlClient
    Imports System.IO
    Imports System.Linq

    Module Module1
        Function GetMissingFiles(sourceDirectory As String, category As String) As List(Of String)
            Dim missingFiles As New List(Of String)

            Dim filesInDatabase As New List(Of String)

            ' Query the database for the files in the given category.
            Using conn As New SqlConnection("connection string here")
                conn.Open()
                Dim sqlCmd As String = "SELECT filename FROM fileheaders WHERE [category] = @category"
                Using query As New SqlCommand(sqlCmd, conn)
                    'TODO: change .SqlDbType to what it is in the database.
                    query.Parameters.Add(New SqlParameter With {.ParameterName = "@category", .SqlDbType = SqlDbType.NVarChar, .Value = category})

                    Using rdr As SqlDataReader = query.ExecuteReader()
                        While rdr.Read()
                            filesInDatabase.Add(rdr.GetString(0))
                        End While
                    End Using
                End Using
                ' End Using closes the connection, so no explicit conn.Close() is needed.
            End Using

            'TODO: it could be that filesInDatabase.Count = 0 is valid. Adjust if required.
            If filesInDatabase.Count > 0 Then
                ' Get the existing files from the given directory.

                ' The extensions we are going to consider.
                Dim extensions() As String = {"pdf", "tif"}

                Dim existingFiles As New List(Of String)

                ' Get all the file names (without the path) to consider.
                For Each extn In extensions
                    existingFiles.AddRange(Directory.GetFiles(sourceDirectory, "*." & extn).Select(Function(p) Path.GetFileName(p)))
                Next

                ' Set difference: names on disk that have no database entry.
                ' Except uses the default (case-sensitive) string comparison here.
                missingFiles = existingFiles.Except(filesInDatabase).ToList()

            End If

            Return missingFiles

        End Function

        Sub Whatever()
            ' Initialized to Nothing so the check below is safe if the call throws.
            Dim myMissingFiles As List(Of String) = Nothing
            Try
                myMissingFiles = GetMissingFiles("C:\temp", "RFQ")
            Catch ex As Exception
                ' Inform user it went wrong.
            End Try

            If myMissingFiles IsNot Nothing AndAlso myMissingFiles.Count > 0 Then
                eltFilelist.DataSource = myMissingFiles
                eltFilelist.DataBind()
            End If

        End Sub

    End Module
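
One detail that may matter here: Except compares strings with the default, case-sensitive comparer, so a file stored as "REPORT.PDF" on disk would not match a database entry of "report.pdf". If the names should match regardless of case (an assumption; it depends on how the names were stored), the Except overload that takes a comparer could be used. The sketch below uses a hypothetical module, function name, and sample data.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module CaseInsensitiveExceptSketch
    ' Returns the names in existingFiles that have no match in filesInDatabase,
    ' ignoring case. Assumption: case-insensitive matching is what is wanted.
    Function NamesNotInDatabase(existingFiles As IEnumerable(Of String),
                                filesInDatabase As IEnumerable(Of String)) As List(Of String)
        Return existingFiles.Except(filesInDatabase, StringComparer.OrdinalIgnoreCase).ToList()
    End Function

    Sub Main()
        ' Hypothetical sample data.
        Dim onDisk = New String() {"REPORT.PDF", "scan.tif", "new.pdf"}
        Dim inDatabase = New String() {"report.pdf", "scan.tif"}
        Dim missing = NamesNotInDatabase(onDisk, inDatabase)
        Console.WriteLine(String.Join(", ", missing)) ' prints: new.pdf
    End Sub
End Module
```

Except builds a hash set from the second sequence internally, so the overall work stays roughly linear in the number of names rather than quadratic.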