Question

I have a DataTable with 500K rows in the following format:

Int | Decimal | String

We are using a singleton pattern, and the DataTable ultimately needs to end up as a List(Of AssetAllocation), where AssetAllocation is:

Public Class AssetAllocation
    Property TpId() As Integer
    Property Allocation As List(Of Sector)
End Class

Public Class Sector
    Property Description() As String
    Property Weighting As Decimal
End Class

The LINQ I am using:

Private Shared Function LoadAll() As List(Of AssetAllocation)

        Dim rtn = New List(Of AssetAllocation)

        Using dt = GetRawData()

            Dim dist = (From x In dt.AsEnumerable Select x!TP_ID).ToList().Distinct()

            rtn.AddRange(From i As Integer In dist
                         Select New AssetAllocation With {
                            .TpId = i,
                            .Allocation = (From b In dt.AsEnumerable
                                           Where b!TP_ID = i Select New Sector With {
                                               .Description = b!DESCRIPTION.ToString(),
                                               .Weighting = b!WEIGHT
                                           }).ToList()})
        End Using

        Return rtn
    End Function

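The LINQ takes a long time to execute, and this is due to the inner query that builds the list of sectors. The distinct list contains about 80k ids.

Can this be improved?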

Answer 1

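If I have understood what you are trying to do, this query should perform much better. The trick is to use GroupBy so that you do not search the whole table for matching ids on every iteration. With about 80k distinct ids and 500K rows, the original inner query performs on the order of 80,000 × 500,000 = 40 billion row comparisons, whereas GroupBy buckets the rows by id in a single pass over the table. I wrote it in C#, but I'm sure you can translate it to VB.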

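// dt is the DataTable returned by GetRawData() in the question.
var rtn =
    dt.AsEnumerable()
      // Bucket the rows by id once instead of rescanning the table per id.
      .GroupBy(x => x.Field<int>("TP_ID"))
      .Select(x => new AssetAllocation
      {
          TpId = x.Key,
          // Each group already holds only the rows for this id.
          Allocation = x.Select(y => new Sector
          {
              Description = y.Field<string>("DESCRIPTION"),
              Weighting = y.Field<decimal>("WEIGHT")
          }).ToList()
      }).ToList();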
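
Since the question is in VB, here is a rough translation of the same GroupBy approach, as a sketch rather than a drop-in replacement. BuildSampleTable is a hypothetical stand-in for your GetRawData(), and the column types (Integer, Decimal, String) are assumed from the format you described. Field(Of Decimal) will throw if WEIGHT contains DBNull, so switch to Field(Of Decimal?) if that column is nullable.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Public Class Sector
    Public Property Description As String
    Public Property Weighting As Decimal
End Class

Public Class AssetAllocation
    Public Property TpId As Integer
    Public Property Allocation As List(Of Sector)
End Class

Module GroupByDemo
    ' Hypothetical stand-in for GetRawData(); the schema is an assumption.
    Private Function BuildSampleTable() As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("TP_ID", GetType(Integer))
        dt.Columns.Add("WEIGHT", GetType(Decimal))
        dt.Columns.Add("DESCRIPTION", GetType(String))
        dt.Rows.Add(1, 0.6D, "Equities")
        dt.Rows.Add(1, 0.4D, "Bonds")
        dt.Rows.Add(2, 1D, "Cash")
        Return dt
    End Function

    ' Groups the rows by TP_ID in one pass, then projects each group.
    Private Function LoadAll(dt As DataTable) As List(Of AssetAllocation)
        Return dt.AsEnumerable().
            GroupBy(Function(r) r.Field(Of Integer)("TP_ID")).
            Select(Function(g) New AssetAllocation With {
                .TpId = g.Key,
                .Allocation = g.Select(Function(r) New Sector With {
                    .Description = r.Field(Of String)("DESCRIPTION"),
                    .Weighting = r.Field(Of Decimal)("WEIGHT")
                }).ToList()
            }).ToList()
    End Function

    Sub Main()
        Using dt = BuildSampleTable()
            For Each a In LoadAll(dt)
                Console.WriteLine("{0}: {1} sector(s)", a.TpId, a.Allocation.Count)
            Next
        End Using
    End Sub
End Module
```

In your code, the body of LoadAll would replace both the Distinct() query and the AddRange call, with dt coming from GetRawData() inside the existing Using block.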
