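ForEach-Object is faster than ForEach-Object -Parallel?

Time: 2021-01-27 19:47:07

Tags: multithreading powershell foreach parallel-processing

I have just started using PowerShell, and I am wondering why my parallel script is slower than my normal ForEach-Object script.

My normal ForEach-Object script: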

function Get-ADUsers {
    # get all users in nested groups
}

 function Get-NestedGroupUsers {
    param ( 
        [Parameter(Mandatory = $true)][String]$FileName,
        [Parameter(Mandatory = $true)][String]$searchFileURL
    )
    $storageHolder = @()
    # $storageHolder | Export-Csv -Path "C:\Users\demandx\Desktop\AD User Lists\$FileName.csv" -NoTypeInformation -Force 
    $groupList = Get-Content $searchFileURL 
    $groupList |  ForEach-Object { 
        $allusers = Get-ADUsers -GroupName $_
        $storageHolder += $allusers  
       
    }
    $storageHolder | select ParentGroup, Name, EmployeeNumber, Enabled, LastLogonDate, PasswordLastSet  |Export-Csv -Path "C:\Users\demandx\Desktop\$FileName.csv" -NoTypeInformation -Force
}

My ForEach-Object -Parallel script (I stored the function in a psm1 file and import it here):

Function Get-Members {
    param ( 
        [Parameter(Mandatory = $true)][String]$FileName,
        [Parameter(Mandatory = $true)][String]$searchFileURL
    )
    $groupList = Get-Content $searchFileURL 
    $storageHolder = $groupList |  ForEach-Object -Parallel {
        Import-Module -Name "C:\Users\demandx\Desktop\Get-ADUserMembers.psm1" 
        Get-ADUserMembers -GroupName $_ | Select-Object ParentGroup, Name, EmployeeNumber, Enabled, LastLogonDate, PasswordLastSet
    }   -ThrottleLimit 5
    
    $storageHolder |  Export-Csv -Path "C:\Users\demandx\Desktop\AD User Lists\$FileName.csv" -NoTypeInformation -Force

}

The script for my Get-ADUsers function (gets all users in nested groups):

 function Get-ADUsers { 
    param ( 
        [Parameter(ValuefromPipeline = $true, mandatory = $true)][String] $GroupName
    ) 
    [int]$circular = $null

    # result holder
    $resultHolder = @()
        $table = $null 
        $nestedmembers = $null 
        $adgroupname = $null     



        # get members of the group and member of
        $ADGroupname = get-adgroup $groupname -properties memberof, members

        # list all members as list (no headers) and save to var
        $memberof = $adgroupname | select -expand memberof 
       
        if ($adgroupname) {  
            if ($circular) { 
                $nestedMembers = Get-ADGroupMember -Identity $GroupName -recursive 
                $circular = $null 
            } 
            else { 
                $nestedMembers = Get-ADGroupMember -Identity $GroupName | sort objectclass -Descending
                # if get adgroupmember returns nothing, it uses the members for ordinary getADGroup
                if (!($nestedmembers)) {
                    $unknown = $ADGroupname | select -expand members
                    if ($unknown) {
                        $nestedmembers = @()
                        foreach ($member in $unknown) {
                            $nestedmembers += get-adobject $member
                        }
                    }
                }
            } 
            # loops through each member
            ForEach($nestedmember in $nestedmembers){ 
                # creates the properties into a custom object. 
                $Props = @{
                    Type            = $nestedmember.objectclass;
                    Name            = $nestedmember.name;
                    DisplayName     = "";
                    ParentGroup     = $ADgroupname.name;
                    Enabled         = "";
                    Nesting         = $nesting;
                    DN              = $nestedmember.distinguishedname;
                    Comment         = ""
                    EmployeeNumber  = "";
                    LastLogonDate   = "";
                    PasswordLastSet = "";
                } 
                # if member object is a user
                if ($nestedmember.objectclass -eq "user") { 
                    # saves all the properties in the table. 
                    $nestedADMember = get-aduser $nestedmember.Name -properties enabled, displayname, EmployeeNumber, LastLogonDate, PasswordLastSet
                    $table = new-object psobject -property $props 
                    $table.enabled = $nestedadmember.enabled
                    $table.name = $nestedadmember.samaccountname
                    $table.displayname = $nestedadmember.displayname
                    $table.EmployeeNumber = $nestedadmember.EmployeeNumber
                    $table.LastLogonDate = $nestedadmember.LastLogonDate
                    $table.PasswordLastSet = $nestedadmember.PasswordLastSet

                    #save all in 1 storage
                    $resultHOlder += $table | select type, name, displayname, parentgroup, nesting, enabled, dn, comment , EmployeeNumber, LastLogonDate, PasswordLastSet
                } 


                # if member object is group
                elseif ($nestedmember.objectclass -eq "group") {  
                    $table = new-object psobject -Property $props 
                    # if circular, meaning the groups member of list contains one of its members. 
                    # e.g. if group 2 is a member of group 1 and group 1 is a member of grou 2
                    if ($memberof -contains $nestedmember.distinguishedname) { 
                        $table.comment = "Circular membership" 
                        $circular = 1 
                    } 
                    # for circular output
                    #$table | select type, name, displayname, parentgroup, nesting, enabled, dn, comment 
                    #calling function itself
                    $resultHOlder += Get-ADUsers -GroupName $nestedmember.distinguishedName                             
                } 
                else { 
                    if ($nestedmember) {
                        $table = new-object psobject -property $props
                        $resultHolder += $table | select type, name, displayname, parentgroup, nesting, enabled, dn, comment, EmployeeNumber, LastLogonDate, PasswordLastSet
                    }
                } 
            } 
        } 



    return   $resultHOlder

}

Parallel result:

-------------------------------------------
Days              : 0
Hours             : 0
Minutes           : 1
Seconds           : 2
Milliseconds      : 283
Ticks             : 622833415
TotalDays         : 0.000720872008101852
TotalHours        : 0.0173009281944444
TotalMinutes      : 1.03805569166667
TotalSeconds      : 62.2833415
TotalMilliseconds : 62283.3415

Non-parallel result:

-------------------------------------------
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 35
Milliseconds      : 322
Ticks             : 353221537
TotalDays         : 0.00040882122337963
TotalHours        : 0.00981170936111111
TotalMinutes      : 0.588702561666667
TotalSeconds      : 35.3221537
TotalMilliseconds : 35322.1537

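1 Answer:

Answer 0: (score: 3)

TLDR:

There are 3 reasons:

  1. To take full advantage of ForEach-Object -Parallel, the processing time of the script block needs to be significantly larger than the time it takes to set up the thread and environment.
  2. Import-Module introduces overhead.
  3. Each of these two factors is small on its own, but multiplied by 1000 or more they add up.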

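ForEach-Object -Parallel works very differently from a normal ForEach-Object.

First, a normal ForEach-Object runs in your current PowerShell thread, with access to all variables, loaded memory, and the pipeline. This is fine for 98% of the jobs we run, where an execution time of 1 second is acceptable. In the remaining 2% of cases, we either have a heavily CPU-bound process that maxes out a single CPU core and runs forever, or we need to wait for a response (e.g. an API request) while other work could proceed. That is when ForEach-Object -Parallel is what we need to look at.

The idea behind parallel execution is to take advantage of your brand new AMD Ryzen™ Threadripper™ 3990X with its 64 cores/128 threads by splitting your process into separate "jobs" that can run on multiple CPU cores and multiple threads at the same time. This can speed things up by a large factor, e.g. potentially up to 128 times faster.

To accomplish this, -Parallel creates a new "job" for each script block you execute and starts spreading the jobs across CPU cores for execution. This works well for long-running, CPU-bound processes, but with very short jobs you run into the crux of parallel execution: the setup takes more time than the actual execution. ForEach-Object -Parallel has to fully set up your environment for every "job" you run, e.g. it has to spin up new threads and new PowerShell instances (runspaces) for the jobs to run in.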
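The difference in environment can be seen directly. The following is a minimal sketch, assuming PowerShell 7 or later; the variable `$prefix` and its value are made up for illustration. It shows that a sequential script block sees the caller's variables, while a parallel script block runs in a separate runspace and only sees them through the `$using:` scope modifier.

```powershell
# Requires PowerShell 7+ (ForEach-Object -Parallel does not exist in Windows PowerShell 5.1).
$prefix = 'item-'   # hypothetical variable defined in the caller's scope

# Sequential: the script block runs in the caller's scope and sees $prefix directly.
$sequential = 1..3 | ForEach-Object { "$prefix$_" }

# Parallel: each script block runs in a separate runspace, so $prefix is not
# defined there and expands to an empty string.
$isolated = 1..3 | ForEach-Object -Parallel { "$prefix$_" }

# The $using: scope modifier copies the caller's value into the runspace.
$withUsing = 1..3 | ForEach-Object -Parallel { "$($using:prefix)$_" }

# Output order from -Parallel is not guaranteed, so sort before displaying.
$sequential               # item-1, item-2, item-3
$isolated  | Sort-Object  # 1, 2, 3
$withUsing | Sort-Object  # item-1, item-2, item-3
```

Building that separate environment for every script block is the setup cost discussed here.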

To illustrate the setup time required: writing "Hello World" once on the current thread takes about 1 millisecond:

PS C:\> Measure-Command { Write-Host "Hello World" }
Hello World

Seconds           : 0
Milliseconds      : 1
TotalMilliseconds : 1.9798

Running a single "Hello World" in parallel takes 26 milliseconds:

PS C:\> Measure-Command { 1 | ForEach-Object -Parallel { Write-Host "Hello World" } }
Hello World

Seconds           : 0
Milliseconds      : 26
TotalMilliseconds : 26.052

This means it spent about 25 milliseconds spinning up a new thread and setting up the environment, and 1 millisecond on the actual work.

Writing it 100 times on the currently running thread takes about 83 milliseconds:

PS C:\> Measure-Command { 1..100 | ForEach-Object { Write-Host "Hello World" } }
Hello World
...
Hello World
Hello World

Seconds           : 0
Milliseconds      : 83
TotalMilliseconds : 83.1846

Running it with -Parallel and -ThrottleLimit 5 takes 294 milliseconds:

PS C:\> Measure-Command { 1..100 | ForEach-Object -Parallel { Write-Host "Hello World" }  -ThrottleLimit 5 }
Hello World
...
Hello World
Hello World

Seconds           : 0
Milliseconds      : 294
TotalMilliseconds : 294.3205

This shows how running in parallel is detrimental for tiny individual operations. On the other hand, once each item takes about 1 second to run, you start to see how it works better.

For example, run 5 processes that each take 1 second. First on a single thread:

PS C:\> Measure-Command { 1..5 | ForEach-Object { Start-Sleep -Seconds 1 } }

Seconds           : 5
Milliseconds      : 46
TotalSeconds      : 5.046348
TotalMilliseconds : 5046.348

As expected, it takes just over 5 seconds. Now in parallel:

PS C:\> Measure-Command { 1..5 | ForEach-Object -Parallel { Start-Sleep -Seconds 1 } -ThrottleLimit 5 }

Seconds           : 1
Milliseconds      : 73
TotalSeconds      : 1.0732423
TotalMilliseconds : 1073.2423

It finishes in about one second. ForEach-Object -Parallel is useful when the processing time is significantly longer than the setup time.
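The trade-off can be put into a rough back-of-the-envelope model. This is only a sketch: the two functions are hypothetical helpers, and the model assumes a fixed setup cost per item and that items run in strict waves of `ThrottleLimit`, which real scheduling does not guarantee. It will not reproduce the measured timings exactly, but it shows which way the balance tips.

```powershell
# Crude estimate of wall-clock time in milliseconds. Illustrative only.
function Get-SequentialEstimateMs {
    param([int]$ItemCount, [double]$WorkMs)
    # Every item runs one after another on the current thread.
    $ItemCount * $WorkMs
}

function Get-ParallelEstimateMs {
    param([int]$ItemCount, [double]$WorkMs, [double]$SetupMs, [int]$ThrottleLimit = 5)
    # Items run in waves of at most $ThrottleLimit at a time,
    # and each item pays the setup cost on top of its work.
    $waves = [math]::Ceiling($ItemCount / $ThrottleLimit)
    $waves * ($SetupMs + $WorkMs)
}

# Five 1-second items with roughly 25 ms setup each: parallel wins.
Get-SequentialEstimateMs -ItemCount 5 -WorkMs 1000              # 5000
Get-ParallelEstimateMs   -ItemCount 5 -WorkMs 1000 -SetupMs 25  # 1025

# One hundred 1 ms items: the setup cost dominates and parallel loses.
Get-SequentialEstimateMs -ItemCount 100 -WorkMs 1               # 100
Get-ParallelEstimateMs   -ItemCount 100 -WorkMs 1 -SetupMs 25   # 520
```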

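In addition, in your case you not only have the extra setup overhead: loading the module, which has to happen in each newly set up environment, also adds significantly to the time of the -Parallel version.

For example, let's import the AzureAD module 5 times inside a ForEach-Object script:

PS C:\> Measure-Command { 1..5 | ForEach-Object { Import-Module AzureAD } }

Seconds           : 0
Milliseconds      : 18
TotalSeconds      : 0.0185406
TotalMilliseconds : 18.5406

With ForEach-Object -Parallel there is a significant difference, because it has to load the module 5 times, whereas within a single thread the module is loaded only once and later calls notice that it is already loaded instead of reloading it.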
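To check how much the import contributes on your own machine, a comparison along these lines may help. It is a sketch assuming PowerShell 7 or later; the module `DemoOverhead.psm1` and its `Get-Demo` function are throwaway placeholders created only so the example is self-contained, and a real module with dependencies will cost considerably more to load.

```powershell
# Requires PowerShell 7+. Creates a throwaway module in the temp directory;
# substitute the path of your own module to measure its real import cost.
$modulePath = Join-Path ([System.IO.Path]::GetTempPath()) 'DemoOverhead.psm1'
Set-Content -Path $modulePath -Value 'function Get-Demo { "demo" }'

# Same runspace: the first Import-Module loads the file,
# later calls find the module already loaded.
$sequential = Measure-Command {
    1..5 | ForEach-Object { Import-Module -Name $modulePath }
}

# Parallel: the import runs inside separate runspaces, which do not
# share the modules loaded in the caller's session.
$parallel = Measure-Command {
    1..5 | ForEach-Object -Parallel { Import-Module -Name $using:modulePath } -ThrottleLimit 5
}

'Sequential: {0:N1} ms' -f $sequential.TotalMilliseconds
'Parallel:   {0:N1} ms' -f $parallel.TotalMilliseconds

# Clean up the placeholder module.
Remove-Module -Name DemoOverhead -ErrorAction SilentlyContinue
Remove-Item -Path $modulePath
```

The absolute numbers depend on the machine and the module, so treat the output as a relative comparison only.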