I have a DataTable with 500K rows in the following format:
Int | Decimal | String
We are using the singleton pattern, and our DataTable ultimately needs to end up as a List(Of AssetAllocation), where AssetAllocation is:
Public Class AssetAllocation
    Property TpId() As Integer
    Property Allocation As List(Of Sector)
End Class

Public Class Sector
    Property Description() As String
    Property Weighting As Decimal
End Class
The LINQ I'm using:
Private Shared Function LoadAll() As List(Of AssetAllocation)
    Dim rtn = New List(Of AssetAllocation)
    Using dt = GetRawData()
        Dim dist = (From x In dt.AsEnumerable Select x!TP_ID).ToList().Distinct()
        rtn.AddRange(From i As Integer In dist
                     Select New AssetAllocation With {
                         .TpId = i,
                         .Allocation = (From b In dt.AsEnumerable
                                        Where b!TP_ID = i Select New Sector With {
                                            .Description = b!DESCRIPTION.ToString(),
                                            .Weighting = b!WEIGHT
                                        }).ToList()})
    End Using
    Return rtn
End Function
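The LINQ takes a long time to execute, and the cause is the inner query that builds the list of Sectors. The distinct list contains 80k ids.
Can this be improved?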
Answer 0 (score: 1)
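If I've understood what you're trying to do, this query should perform better. The trick is to use GroupBy so that you avoid searching the whole table for matching ids on every iteration.
I wrote it in C#, but I'm sure you can translate it to VB.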
var rtn =
    dt.AsEnumerable()
      .GroupBy(x => x.Field<int>("TP_ID"))
      .Select(x => new AssetAllocation()
      {
          TpId = x.Key,
          Allocation = x.Select(y => new Sector
          {
              Description = y.Field<string>("Description"),
              Weighting = y.Field<decimal>("WEIGHT")
          }).ToList()
      }).ToList();
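
For reference, a VB translation of the same GroupBy approach might look roughly like the sketch below. It is untested against the real data and rests on a few assumptions: TP_ID is stored as Integer, WEIGHT as Decimal and DESCRIPTION as String with no DBNull values (Field(Of T) throws otherwise), and the project references System.Data.DataSetExtensions where needed. BuildAllocations and BuildSampleTable are hypothetical names made up for this example; in the original code the table would come from GetRawData().

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Public Class Sector
    Public Property Description As String
    Public Property Weighting As Decimal
End Class

Public Class AssetAllocation
    Public Property TpId As Integer
    Public Property Allocation As List(Of Sector)
End Class

Module GroupByDemo
    ' Groups the rows once by TP_ID, so each row is visited a single time
    ' instead of rescanning the whole table for every distinct id.
    Function BuildAllocations(dt As DataTable) As List(Of AssetAllocation)
        Return dt.AsEnumerable().
            GroupBy(Function(row) row.Field(Of Integer)("TP_ID")).
            Select(Function(g) New AssetAllocation With {
                .TpId = g.Key,
                .Allocation = g.Select(Function(row) New Sector With {
                    .Description = row.Field(Of String)("DESCRIPTION"),
                    .Weighting = row.Field(Of Decimal)("WEIGHT")
                }).ToList()
            }).ToList()
    End Function

    ' Hypothetical stand-in for GetRawData(): a tiny table with the same
    ' Int | Decimal | String layout described in the question.
    Function BuildSampleTable() As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("TP_ID", GetType(Integer))
        dt.Columns.Add("WEIGHT", GetType(Decimal))
        dt.Columns.Add("DESCRIPTION", GetType(String))
        dt.Rows.Add(1, 0.6D, "Equity")
        dt.Rows.Add(1, 0.4D, "Bonds")
        dt.Rows.Add(2, 1D, "Cash")
        Return dt
    End Function

    Sub Main()
        Using dt = BuildSampleTable()
            For Each a In BuildAllocations(dt)
                Console.WriteLine("{0}: {1} sector(s)", a.TpId, a.Allocation.Count)
            Next
        End Using
    End Sub
End Module
```

The original query runs the inner Where once per distinct id, so with roughly 80k ids and 500K rows it performs on the order of 40 billion row comparisons. GroupBy builds the groups in a single pass over the table, which is where the speed-up should come from.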