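I wrote a generic function that finds and fills in missing values in a sequence. I have several classes that implement IColumnData, so I pass in a collection of one of those classes and expect the missing data to be filled in.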
Private Function FillMissingData(Of T As {IColumnData, New})(data As IEnumerable(Of T), valueFunc As Func(Of Integer, Double)) As IEnumerable(Of T)
    Dim range As IEnumerable(Of Integer) = Enumerable.Range(1, 10)
    Dim current As IEnumerable(Of T) = data
    If current.Count() < range.Count() Then
        Dim missingPeriods As IEnumerable(Of Integer) = range.Except(data.Select(Function(d) d.Column))
        Dim missingData As IEnumerable(Of T)
        missingData = missingPeriods.Select(Function(column) New T() With {.Column = column, .Value = valueFunc(column)})
        current = data.Union(missingData).OrderBy(Function(r) r.Column)
    End If
    Return current
End Function
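The function works, but I'm not happy with the code: it looks messy, and performance is poor when the range is large. This code is expected to run about 100K times a day from an aspx page.
I'm looking for a solution specific to this code.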
Answer 0 (score: 4)
If your data is already sorted by Column, you can improve performance with an iterator method like this:
Private Iterator Function FillMissingData(Of T As {IColumnData, New})(data As IEnumerable(Of T), valueFunc As Func(Of Integer, Double)) As IEnumerable(Of T)
    Dim nextExpectedColumn = 1
    Dim maxColumn = 10
    For Each element As T In data
        'Yield the missing elements
        For column As Integer = nextExpectedColumn To element.Column - 1
            Yield New T() With {.Column = column, .Value = valueFunc(column)}
        Next
        Yield element
        nextExpectedColumn = element.Column + 1
    Next
    For column As Integer = nextExpectedColumn To maxColumn
        Yield New T() With {.Column = column, .Value = valueFunc(column)}
    Next
End Function
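For unsorted data sets, the Except method may be slow because it needs additional memory for a HashSet. In addition, ordering the new collection is O(n log n). The approach above has linear time complexity but requires the input data to be sorted.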
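To make the calling side concrete, here is a minimal, self-contained sketch of how the iterator version might be used. The post does not show IColumnData, so the interface below (an Integer Column and a Double Value) is an assumption inferred from how the code uses it. SalesData and the Demo module are hypothetical names for illustration only, and the function is marked Public here purely so the sketch compiles on its own.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Assumed shape of the interface; the original post does not show it.
Public Interface IColumnData
    Property Column As Integer
    Property Value As Double
End Interface

' Hypothetical implementing class, for illustration only.
Public Class SalesData
    Implements IColumnData
    Public Property Column As Integer Implements IColumnData.Column
    Public Property Value As Double Implements IColumnData.Value
End Class

Public Module Demo
    ' Same iterator as in the answer; the range 1..10 is hard-coded as in the original.
    Public Iterator Function FillMissingData(Of T As {IColumnData, New})(data As IEnumerable(Of T), valueFunc As Func(Of Integer, Double)) As IEnumerable(Of T)
        Dim nextExpectedColumn = 1
        Dim maxColumn = 10
        For Each element As T In data
            ' Yield placeholders for the gap before this element
            For column As Integer = nextExpectedColumn To element.Column - 1
                Yield New T() With {.Column = column, .Value = valueFunc(column)}
            Next
            Yield element
            nextExpectedColumn = element.Column + 1
        Next
        ' Yield placeholders for the gap after the last element
        For column As Integer = nextExpectedColumn To maxColumn
            Yield New T() With {.Column = column, .Value = valueFunc(column)}
        Next
    End Function

    Public Sub Main()
        ' Input already sorted by Column, as the iterator requires.
        Dim input = New List(Of SalesData) From {
            New SalesData With {.Column = 2, .Value = 20.0},
            New SalesData With {.Column = 5, .Value = 50.0}
        }
        ' Missing columns get a default value of 0.0 in this sketch.
        Dim filled = FillMissingData(input, Function(c) 0.0)
        ' Enumerating yields ten elements, columns 1 through 10.
        For Each item In filled
            Console.WriteLine($"{item.Column}: {item.Value}")
        Next
    End Sub
End Module
```

If the input might not be sorted, calling OrderBy(Function(d) d.Column) on it before passing it in restores the precondition, at the O(n log n) sorting cost mentioned above. Because the function is an iterator, nothing runs until the result is enumerated, so call ToList() once if the result will be read more than once.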