VB.NET - Parallel.For is slower than a sequential For

Time: 2017-05-11 19:10:46

Tags: vb.net multithreading parallel-processing parallel.for

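I am trying to use parallel processing to separate data according to its content.

In the example below, I generate random numbers and, if they meet a condition, I want to store them in a DataTable.

To my disappointment, the sequential version is faster than the parallel one.

Is it possible to make this work run faster in parallel?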

Imports System.Random
Imports System.Threading
Imports System.Threading.Tasks

Public Class Form1
    Public No As Integer = 5
    Public DT(No) As DataTable
    Public S(No) As String
    Public StartTimer As DateTime
    Private Sub ParrallelProc_Btn_Click(sender As Object, e As EventArgs) Handles ParrallelProc_Btn.Click
        For j = 1 To No
            DT(j).Rows.Clear()
        Next
        StartTimer = Now
        For k = 1 To 10000
            Parallel.For(1, No + 1, Sub(i)
                                        Dim CurrentNo As String = CStr(Math.Round(Rnd() * 1000000, 0))
                                        If CurrentNo.Contains(S(i)) Then DT(i).Rows.Add(CurrentNo)
                                    End Sub)
        Next
        Dim Interval = Now.Subtract(StartTimer).TotalSeconds
    End Sub

    Private Sub SequentialProc_Btn_Click(sender As Object, e As EventArgs) Handles SequentialProc_Btn.Click
        For j = 1 To No
            DT(j).Rows.Clear()
        Next
        StartTimer = Now
        For k = 1 To 10000
            For l = 1 To No
                Dim CurrentNo As String = CStr(Math.Round(Rnd() * 1000000, 0))
                If CurrentNo.Contains(S(l)) Then DT(l).Rows.Add(CurrentNo)
            Next
        Next
        Dim Interval = Now.Subtract(StartTimer).TotalSeconds
    End Sub
End Class

2 answers:

Answer 0 (score: 0)

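First, not to brag, but on my computer the parallel version runs in 160 ms and the sequential version in 40 ms.

Creating threads has some overhead, and it is not worth paying for only 5 threads, since you may be doing only 5 things at a time. That is especially true for work as lightweight as yours. Parallelization is meant for running multiple long-running tasks at the same time.

Eventually, once you overcome the threading overhead, the parallel loop becomes faster. I tested this by increasing No, and the crossover happens at around 100.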

Public No As Integer = 100
Public DT(No) As DataTable
Public S(No) As String
Public StartTimer As DateTime
Private iterations As Integer = 10000

Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    For i = 1 To No
        DT(i) = New DataTable()
        DT(i).Columns.Add()
        S(i) = (i + 1).ToString()
    Next
End Sub

Private Sub ParallelProc_Btn_Click(sender As Object, e As EventArgs) Handles ParallelProc_Btn.Click
    clearDT()
    Dim sw As New Stopwatch()
    sw.Start()
    For k = 1 To iterations
        Parallel.For(
            1,
            No + 1,
            AddressOf process)
    Next
    sw.Stop()
    MessageBox.Show(sw.ElapsedMilliseconds)
End Sub

Private Sub SequentialProc_Btn_Click(sender As Object, e As EventArgs) Handles SequentialProc_Btn.Click
    clearDT()
    Dim sw As New Stopwatch()
    sw.Start()
    For k = 1 To iterations
        For i = 1 To No
            process(i)
        Next
    Next
    sw.Stop() ' Stop timing before showing the result, as in the parallel handler.
    MessageBox.Show(sw.ElapsedMilliseconds)
End Sub

Private Sub clearDT()
    For j = 1 To No
        DT(j).Rows.Clear()
    Next
End Sub

Private Sub process(i As Integer)
    Randomize()
    Dim CurrentNo As String = CStr(Math.Round(Rnd() * 1000000, 0))
    If CurrentNo.Contains(S(i)) Then DT(i).Rows.Add(CurrentNo)
End Sub

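I also moved the operation into a Sub that both routines can call. Reusing code not only saves time and space, it also ensures that only the looping methods are being compared, not two different routines.

You should also call Randomize() before using Rnd(). See https://msdn.microsoft.com/en-us/library/y66ey2hh(v=vs.110).aspx

A better test is to add some substance to the process() method, such as Thread.Sleep(1), and to play with No and iterations. You will find that sleeping in parallel is much better than sleeping sequentially.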
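The Thread.Sleep(1) test suggested above could be sketched roughly as follows. This is a minimal console version, not the form from the answer: the module name `SleepBenchmark` and the helpers `Work`, `TimeSequential`, and `TimeParallel` are hypothetical names introduced for illustration, and the one-millisecond sleep only stands in for real work.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Threading
Imports System.Threading.Tasks

Module SleepBenchmark
    ' Hypothetical stand-in for a task with some substance: block for about 1 ms.
    Private Sub Work(i As Integer)
        Thread.Sleep(1)
    End Sub

    ' Runs Work for every (k, i) pair one after another and returns the elapsed ms.
    Public Function TimeSequential(no As Integer, iterations As Integer) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For k As Integer = 1 To iterations
            For i As Integer = 1 To no
                Work(i)
            Next
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    ' Same workload, but the inner loop is handed to Parallel.For.
    Public Function TimeParallel(no As Integer, iterations As Integer) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For k As Integer = 1 To iterations
            Parallel.For(1, no + 1, AddressOf Work)
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    Sub Main()
        ' Vary both arguments to see where the parallel version starts to win.
        Console.WriteLine("Sequential: {0} ms", TimeSequential(20, 10))
        Console.WriteLine("Parallel:   {0} ms", TimeParallel(20, 10))
    End Sub
End Module
```

The timings depend on the machine and on how many thread-pool threads are available, so the numbers are only meaningful relative to each other on the same system.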

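Answer 1 (score: 0)

Put the smaller loop inside the larger loop, that is, parallelize the outer loop over the 10000 iterations and run the small loop over No inside it. That should make the parallel version much faster than the sequential one.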
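A rough sketch of that restructuring, under some assumptions that go beyond the answer: DataTable is documented as unsafe for concurrent writes, so the sketch swaps in one ConcurrentBag(Of String) per pattern instead of DT(i), and it swaps Rnd() for one System.Random per thread because System.Random is not thread-safe. The names `OuterLoopParallel`, `Collect`, `bags`, and `patterns` are hypothetical, and the search patterns mirror the Form1_Load code from the first answer.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Threading
Imports System.Threading.Tasks

Module OuterLoopParallel
    ' Parallelizes the large outer loop; the small loop over patterns runs inside it.
    ' Returns one thread-safe bag of matches per pattern (index 0 is unused).
    Public Function Collect(no As Integer, iterations As Integer) As ConcurrentBag(Of String)()
        Dim bags(no) As ConcurrentBag(Of String)
        Dim patterns(no) As String
        For i As Integer = 1 To no
            bags(i) = New ConcurrentBag(Of String)()
            patterns(i) = (i + 1).ToString()
        Next

        ' Assumption: one Random per thread, each with a distinct seed.
        Dim seed As Integer = Environment.TickCount
        Dim rng As New ThreadLocal(Of Random)(
            Function() New Random(Interlocked.Increment(seed)))

        Parallel.For(0, iterations,
            Sub(k)
                For i As Integer = 1 To no
                    ' Next(0, 1000001) yields an integer from 0 to 1000000 inclusive.
                    Dim currentNo As String = rng.Value.Next(0, 1000001).ToString()
                    If currentNo.Contains(patterns(i)) Then bags(i).Add(currentNo)
                Next
            End Sub)

        rng.Dispose()
        Return bags
    End Function

    Sub Main()
        Dim result = Collect(5, 10000)
        For i As Integer = 1 To 5
            Console.WriteLine("Pattern {0}: {1} matches", i + 1, result(i).Count)
        Next
    End Sub
End Module
```

With this shape, Parallel.For is invoked once and splits 10000 iterations among its workers, rather than being invoked 10000 times for 5 tiny work items each. If the results must end up in DataTables, the bags can be copied into them on a single thread after the loop finishes.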
