VB.NET For loop is too slow when fetching XML

Date: 2013-07-18 15:37:29

Tags: xml vb.net loops for-loop

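I'm fetching values from XML files (resource files) and basically inserting them into a DataTable. I have 679 keys to fetch from the resource files, and it takes 3.41 seconds. I'd like to know whether there is any way to make this loop faster.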

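I have already tried a Parallel.For loop, but I found it unstable because it starts inserting a row before the previous insert has finished. I then used a sync block, but the time went back to 3.41 seconds.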

    For idx As Integer = 0 To KeyNames.Length - 1
        With KeyManagerResource.Instance
            DataTableManager.Instance.InsertRow(KeyNames(idx), .GetKeyValue(KeyNames(idx), DynamicProperties.Instance.EnglishResourcePath), _
                                                               .GetKeyValue(KeyNames(idx), DynamicProperties.Instance.FrenchResourcePath))
        End With
    Next

 ''' <summary>
''' Gets the value of the key.
''' </summary>
''' <param name="ID">ID of the key.</param>
''' <returns>Value of the key.</returns>
''' <remarks></remarks>
Overrides Function GetKeyValue(ID As String, File As String) As String

    'Sets the current path of the XML reader to the requested resource file.
    XMLManager.Instance.SetReaderPath(File)

    Dim returnedNode As Xml.XmlNode = XMLManager.Instance.GetNode(String.Format("//data" & Helper.CaseInsensitiveSearch("name"), "'" & ID.ToLower & "'"))

    If returnedNode IsNot Nothing Then
        Return returnedNode.ChildNodes(1).InnerText
    Else
        Return ""
    End If

End Function

 ''' <summary>
''' Adds a row to the target table.
''' </summary>
''' <param name="RowValues">The row values we want to insert. These are in order, so it is presumed the first row value in the array is for the first column 
''' of the target data table.</param>
''' <remarks></remarks>
Public Sub InsertRow(ByVal ParamArray RowValues() As String)

    'If the number of row values does not match the number of columns, the insert is invalid, so an exception is thrown.
    If RowValues.Length = dtTargetTable.Columns.Count Then

        'Creates a new row.
        Dim drNewRow As DataRow
        drNewRow = dtTargetTable.NewRow

        'Goes through the row values.
        For idx As Integer = 0 To RowValues.Length - 1

            'Store the value for the column.
            drNewRow(dtTargetTable.Columns(idx)) = RowValues(idx)

        Next

        'Only adds the key if the primary key doesn't already exist.
        If dtTargetTable.Rows.Find(RowValues(0)) Is Nothing Then
            'Adds the row to the table.
            dtTargetTable.Rows.InsertAt(drNewRow, 0)
        End If

    Else
        Throw New Exception(String.Format("Invalid insert. The number of row values passed is not equal to the number of columns of the target DataTable. " & _
                                          "The target DataTable has {0} columns.", dtTargetTable.Columns.Count))
    End If

End Sub

1 Answer:

Answer 0 (score: 0)

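I have a few suggestions that may help:

Don't retrieve the key name by index several times; using a For Each loop reduces the amount of work.

Don't switch XML files on every iteration; instead, initialize the instances outside the loop and pass them to the appropriate method (I'm not sure what the instance type is, so I made up one called XMLManagerInstance).

Don't check the DataTable for an existing primary key on every iteration. Instead, keep a collection of the primary keys already used outside the loop, and skip the work entirely if the PK already exists.

These should help your performance, especially the last two.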
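
The third suggestion can be looked at in isolation. The following is a minimal sketch, not the asker's actual classes: the module name PrimaryKeyFilterSketch and the function DistinctKeys are hypothetical, and the case-insensitive comparer is an assumption based on the case-insensitive lookup in the question. It relies on HashSet(Of String).Add returning False when the value is already present, so one call both tests and records the key.

```vbnet
Imports System
Imports System.Collections.Generic

Module PrimaryKeyFilterSketch

    ' Hypothetical helper: returns the keys in their original order and
    ' drops any key that has already been seen.
    Function DistinctKeys(keyNames As IEnumerable(Of String)) As List(Of String)
        ' Assumption: keys that differ only by case count as the same key.
        Dim seen As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)
        Dim result As New List(Of String)

        For Each keyName As String In keyNames
            ' Add returns False for a duplicate, so no separate Contains call is needed.
            If seen.Add(keyName) Then
                result.Add(keyName)
            End If
        Next

        Return result
    End Function

    Sub Main()
        Dim keys As List(Of String) = DistinctKeys(New String() {"Title", "Save", "title", "Cancel"})
        Console.WriteLine(String.Join(", ", keys)) ' Title, Save, Cancel
    End Sub

End Module
```

A set lookup takes roughly constant time per key, whereas Rows.Find has to consult the table's primary-key index on every insert.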

Here are the suggested changes to the code:

    'KeyNames is assumed to be populated elsewhere, as in the question.
    Dim KeyNames As List(Of String)

    Dim cPrimaryKeys As New System.Collections.Generic.HashSet(Of String)

    'XMLManagerInstance is a made-up type name; the instances must be created before use.
    Dim oEnglishFile As New XMLManagerInstance
    Dim oFrenchFile As New XMLManagerInstance

    'Point each instance at its file once, outside the loop.
    oEnglishFile.SetReaderPath(DynamicProperties.Instance.EnglishResourcePath)
    oFrenchFile.SetReaderPath(DynamicProperties.Instance.FrenchResourcePath)

    For Each KeyName As String In KeyNames
        'Skip keys that have already been inserted.
        If Not cPrimaryKeys.Contains(KeyName) Then
            cPrimaryKeys.Add(KeyName)
            With KeyManagerResource.Instance
                DataTableManager.Instance.InsertRow(KeyName, .GetKeyValue(KeyName, oEnglishFile), .GetKeyValue(KeyName, oFrenchFile))
            End With
        End If
    Next

''' <summary>
''' Gets the value of the key.
''' </summary>
''' <param name="ID">ID of the key.</param>
''' <returns>Value of the key.</returns>
''' <remarks></remarks>
Public Function GetKeyValue(ID As String, FileInstance As XMLManagerInstance) As String

    Dim returnedNode As Xml.XmlNode = FileInstance.GetNode(String.Format("//data" & Helper.CaseInsensitiveSearch("name"), "'" & ID.ToLower & "'"))

    If returnedNode IsNot Nothing Then
        Return returnedNode.ChildNodes(1).InnerText
    Else
        Return ""
    End If

End Function

''' <summary>
''' Adds a row to the target table.
''' </summary>
''' <param name="RowValues">The row values we want to insert. These are in order, so it is presumed the first row value in the array is for the first column 
''' of the target data table.</param>
''' <remarks></remarks>
Public Sub InsertRow(ByVal ParamArray RowValues() As String)

    'If the number of row values does not match the number of columns, the insert is invalid, so an exception is thrown.
    If RowValues.Length = dtTargetTable.Columns.Count Then
        'Creates a new row.
        Dim drNewRow As DataRow

        drNewRow = dtTargetTable.NewRow

        'Goes through the row values.
        For idx As Integer = 0 To RowValues.Length - 1
            'Store the value for the column.
            drNewRow(dtTargetTable.Columns(idx)) = RowValues(idx)
        Next

        'Adds the row to the table. The duplicate-key check is now done by the caller.
        dtTargetTable.Rows.InsertAt(drNewRow, 0)
    Else
        Throw New Exception(String.Format("Invalid insert. The number of row values passed is not equal to the number of columns of the target DataTable. " & _
                                          "The target DataTable has {0} columns.", dtTargetTable.Columns.Count))
    End If

End Sub
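
The second suggestion can be taken one step further by swapping the per-key XPath query for a one-time load into a dictionary. This is only a sketch under stated assumptions, not the asker's XMLManager: the module ResourceCacheSketch and the functions LoadValues and GetValueOrEmpty are hypothetical, and the file is assumed to follow a .resx-like layout in which each data element has a name attribute and a value child element. Each file would be loaded once before the loop, and every key lookup then becomes a dictionary read.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module ResourceCacheSketch

    ' Hypothetical helper: walks the document once and builds a
    ' case-insensitive name -> value map.
    ' Assumed layout: <data name="..."><value>...</value></data>
    Function LoadValues(doc As XmlDocument) As Dictionary(Of String, String)
        Dim values As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)

        For Each node As XmlNode In doc.SelectNodes("//data[@name]")
            Dim name As String = node.Attributes("name").Value
            Dim valueNode As XmlNode = node.SelectSingleNode("value")

            ' Keep the first occurrence of a name and ignore entries without a value.
            If valueNode IsNot Nothing AndAlso Not values.ContainsKey(name) Then
                values.Add(name, valueNode.InnerText)
            End If
        Next

        Return values
    End Function

    ' Mirrors the GetKeyValue contract from the question:
    ' an empty string is returned when the key is missing.
    Function GetValueOrEmpty(values As Dictionary(Of String, String), key As String) As String
        Dim value As String = Nothing
        If values.TryGetValue(key, value) Then Return value
        Return ""
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml("<root><data name=""Greeting""><value>Hello</value></data></root>")

        Dim english As Dictionary(Of String, String) = LoadValues(doc)
        Console.WriteLine(GetValueOrEmpty(english, "greeting")) ' Hello
    End Sub

End Module
```

With real files, doc.Load(path) would be called once per language before the loop, so the cost of parsing and scanning each file is paid once rather than once per key.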