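I have a migration job, and I need to validate the target data once it is done. To notify the admin whether validation succeeded or failed, I use a counter to compare the number of rows in table Foo of Database1 with the number of rows in table Foo of Database2.
Each row in Database2 is validated against the corresponding row in Database1. To speed up the process, I use a Parallel.ForEach loop.
My initial problem was that the count was always different from what I expected. I later found out that the += and -= operations are not thread-safe (not atomic). To fix this, I updated the code to use Interlocked.Increment on the counter variable. This code prints a count closer to the actual count, but it still seems to differ on every execution, and it does not give the result I expect: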
Private countObjects As Integer

Private Sub MyMainFunction()
    Dim objects As List(Of MyObject)
    'Query with Dapper, irrelevant to the problem.
    Using connection As New System.Data.SqlClient.SqlConnection("aConnectionString")
        objects = connection.Query("SELECT * FROM Foo") 'Returns around 81000 rows.
    End Using
    Parallel.ForEach(objects, Sub(u) MyParallelFunction(u))
    Console.WriteLine(String.Format("Count : {0}", countObjects)) 'Prints "Count : 80035" or another incorrect count, which seems to differ on each execution of MyMainFunction.
End Sub

Private Sub MyParallelFunction(obj As MyObject)
    Interlocked.Increment(countObjects) 'Breakpoint Hit Count is at around 81300 or another incorrect number when done.
    'Continues executing unrelated code using obj...
End Sub
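After some experimenting with other ways to make the increment thread-safe, I found that wrapping the increment in a SyncLock on a dummy reference object gives the expected result: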
Private countObjects As Integer
Private locker As SomeType

Private Sub MyMainFunction()
    locker = New SomeType()
    Dim objects As List(Of MyObject)
    'Query with Dapper, irrelevant to the problem.
    Using connection As New System.Data.SqlClient.SqlConnection("aConnectionString")
        objects = connection.Query("SELECT * FROM Foo") 'Returns around 81000 rows.
    End Using
    Parallel.ForEach(objects, Sub(u) MyParallelFunction(u))
    Console.WriteLine(String.Format("Count : {0}", countObjects)) 'Prints "Count : 81000".
End Sub

Private Sub MyParallelFunction(obj As MyObject)
    SyncLock locker
        countObjects += 1 'Breakpoint Hit Count is 81000 when done.
    End SyncLock
    'Continues executing unrelated code using obj...
End Sub
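Why doesn't the first snippet work as expected? The most confusing part is that the Breakpoint Hit Count gives unexpected results.
Is my understanding of Interlocked.Increment or of atomic operations flawed? I would rather not use SyncLock on a dummy object, and I hope there is a clean way to do this.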
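Update
I am running the sample in Any CPU mode under Debug.
I call ThreadPool.SetMaxThreads(60, 60), because I query an Access database at some point. Could this cause the problem?
Could the call to Increment disrupt the Parallel.ForEach loop, forcing it to exit before all tasks have completed?

Update 2 (methodology)
I always verify objects.Count at a breakpoint before continuing to Parallel.ForEach.
The only code that changes between executions is Interlocked.Increment being replaced by SyncLock locker and countObjects += 1.

Update 3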
I created an SSCCE by copying my code into a new console application and replacing the external classes and code.
Here is the console app's Main method:
Sub Main()
    Dim oClass1 As New Class1
    oClass1.MyMainFunction()
End Sub
Here is Class1:
Imports System.Threading

Public Class Class1
    Public Class Dummy
        Public Sub New()
        End Sub
    End Class

    Public Class MyObject
        Public Property Id As Integer
        Public Sub New(p_Id As Integer)
            Id = p_Id
        End Sub
    End Class

    Public Property countObjects As Integer
    Private locker As Dummy

    Public Sub MyMainFunction()
        locker = New Dummy()
        Dim objects As New List(Of MyObject)
        For i As Integer = 1 To 81000
            objects.Add(New MyObject(i))
        Next
        Parallel.ForEach(objects, Sub(u As MyObject)
                                      MyParallelFunction(u)
                                  End Sub)
        Console.WriteLine(String.Format("Count : {0}", countObjects)) 'Interlock prints an incorrect count, different in each execution. SyncLock prints the correct count.
        Console.ReadLine()
    End Sub

    'Interlocked
    Private Sub MyParallelFunction(ByVal obj As MyObject)
        Interlocked.Increment(countObjects)
    End Sub

    'SyncLock
    'Private Sub MyParallelFunction(ByVal obj As MyObject)
    '    SyncLock locker
    '        countObjects += 1
    '    End SyncLock
    'End Sub
End Class
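I still see the same behavior in this SSCCE when switching MyParallelFunction between Interlocked.Increment and SyncLock: the Interlocked version prints an incorrect count, and the SyncLock version prints the correct one.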
Answer 0 (score: 9)
Interlocked.Increment on a property will always be broken. In effect, the VB compiler rewrites the call to:
Value = <value from Property>
Interlocked.Increment(Value)
<Property> = Value
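This breaks any threading guarantee that Increment provides, because the read, the increment of the temporary, and the write back to the property are three separate steps that other threads can interleave with. Change it to a field. VB rewrites any property passed as a ByRef argument into code similar to the above.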
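As a minimal sketch of that fix (the Counter class, its Increment method, and the _countObjects field name are hypothetical, introduced only for illustration), the counter can live in a private field that is passed ByRef to Interlocked.Increment, with a read-only property exposing the value to callers:

```vb
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Module Program
    ' Hypothetical helper class, not part of the original code.
    Public Class Counter
        ' A plain field: passing it ByRef hands Interlocked the real
        ' storage location, so no temporary copy is made.
        Private _countObjects As Integer

        ' Read-only view of the counter for callers.
        Public ReadOnly Property CountObjects As Integer
            Get
                Return _countObjects
            End Get
        End Property

        ' Atomic increment performed directly on the field.
        Public Sub Increment()
            Interlocked.Increment(_countObjects)
        End Sub
    End Class

    Sub Main()
        Dim counter As New Counter()
        ' Same number of iterations as the SSCCE in the question.
        Parallel.For(0, 81000, Sub(i As Integer) counter.Increment())
        Console.WriteLine("Count : {0}", counter.CountObjects) 'Prints "Count : 81000".
    End Sub
End Module
```

In the SSCCE from the question, the equivalent change would be to replace Public Property countObjects As Integer with a field declaration such as Private countObjects As Integer, leaving the Interlocked.Increment(countObjects) call unchanged.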