Inefficiency when comparing an array to a file directory in PowerShell

Date: 2019-01-23 19:35:02

Tags: arrays powershell loops

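I'm not entirely new to PS, but I was thrown into using it at work without any guidance, which makes learning hard because I don't know what I don't know. That said, I have a fully working script and would like pointers on how to make it faster, or possibly on a completely different approach.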

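Problem: I have a bunch of photos in a folder that need to be compared against a list of IDs from a SQL table. If a match is found, the photo needs to be moved to another directory for processing.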

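I know this is inefficient because I list the directory on every iteration of the ID loop, but when I tried building an array of files as well and then comparing the two arrays, I couldn't get it to work.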

Again, the script works fine; it just isn't fast.

CLS

$startDate = Get-Date
$startDate

$photoSourceLocation = "C:\Temp\Photos\Aggregate" #"C:\Temp\Photos\Moved"
$photoDropLocation = "C:\Temp\Photos\Process"

$IdQuery = "SELECT ST.Id FROM SomeTable as ST"


$patients = Invoke-Sqlcmd -Query $IdQuery -ServerInstance "SQLServer" -Database "DB"
$photos = GCI $photoSourceLocation -File

$patientRaw = $patients | Measure 
$patientCount = $patientRaw.Count

#$PatientCount
$I = 1
$Out= ""

forEach ($patient in $Patients)
        {
        $MovePhoto = GCI $photoSourceLocation -File | Where-Object {$_.BaseName -contains $patient.Id}
        if($MovePhoto)
        {
        Move-Item $movePhoto.FullName -Destination $photoDropLocation 
        }
        #$MovePhoto.FullName
        Write-Progress -Activity "Processing photo matches." -Status "Progress:" -PercentComplete ($I/$patientRaw.count*100) #-End $Out
        $I++        
        }

$endDate = Get-Date     
$endDate

2 Answers:

Answer 0 (score: 0)

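Listing the files again on every iteration is probably what makes it so slow. You should fetch the files only once (see the comments in the code):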

Untested

cls
($startDate = Get-Date)

$photoSourceLocation = "C:\Temp\Photos\Aggregate"
$photoDropLocation = "C:\Temp\Photos\Process"

$idQuery = "SELECT Id FROM SomeTable"
# select ids as plain string array (makes it easier for "contains")
$patientIds = @(Invoke-Sqlcmd -Query $idQuery -ServerInstance "SQLServer" -Database "DB" | foreach { $_.Id.ToString() })

# get all photos only once
# make it a list so we can remove items
$photos = [System.Collections.ArrayList]@(gci $photoSourceLocation -File)
$i = 0
foreach ($id in $patientIds) {
    # check the list of photos
    for ($p = 0; $p -lt $photos.Count; $p++) {
        # take the current photo; assigning to $photos here would overwrite the list
        $photo = $photos[$p]
        if ($photo.BaseName.Contains($id)) {
            # match found: move photo
            Move-Item $photo.FullName -Destination $photoDropLocation 
            # remove from list, so we have to search less next time
            $photos.RemoveAt($p)
            $p--
            # if there can only be one photo, we could stop looking here:
            # break
        }
    }
    $i++
    Write-Progress -Activity "Processing photo matches." -Status "Progress:" -PercentComplete ($i * 100 / $patientIds.Count)
}

($endDate = Get-Date)
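
The code above converts each ID to a string so that the `String.Contains` method can do a substring test. That is different from the `-contains` operator used in the question, which tests collection membership. A minimal sketch of the difference, assuming a hypothetical base name and ID that do not come from the original post:

```powershell
# Hypothetical values, for illustration only.
$baseName = 'Patient_12345_front'
$id       = '12345'

# -contains asks "does this collection have an element equal to the value?".
# With a single string on the left, it only succeeds on a whole-string match.
$operatorSubstring = $baseName -contains $id         # False
$operatorWhole     = $baseName -contains $baseName   # True

# String.Contains asks "is the value a substring?" (case-sensitive).
$methodSubstring   = $baseName.Contains($id)         # True

# -like with wildcards is a case-insensitive alternative.
$likeSubstring     = $baseName -like "*$id*"         # True
```

If the file base name is exactly the ID, `-eq` would be the more precise test; whether that holds for the photos in question is not stated in the post.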

Notes:

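  • Whether you loop over the IDs in the outer loop and over the photos in the inner loop, or the other way round, depends on your use case.
  • If there can only be one photo per ID, uncomment the "break".
  • Removing elements from the list makes it shorter, so there are fewer items to check next time. But it also means the list has to be resized. Test it and compare both options to see whether removing items is actually faster.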
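
One way to run the comparison suggested in the last note is `Measure-Command`. The following is only a sketch on made-up in-memory data (the names, IDs, and sizes are assumptions, and no files are touched); real timings depend on how many photos and IDs you have.

```powershell
# Hypothetical test data: 1000 fake base names and 200 IDs that each match one name.
$names = 1..1000 | ForEach-Object { 'photo_{0:D5}' -f $_ }
$ids   = 1..200  | ForEach-Object { '{0:D5}' -f ($_ * 4) }

# Option A: remove each matched name so later passes scan a shorter list.
$withRemoval = Measure-Command {
    $list = [System.Collections.ArrayList]@($names)
    foreach ($id in $ids) {
        for ($p = 0; $p -lt $list.Count; $p++) {
            if ($list[$p].Contains($id)) {
                $list.RemoveAt($p)
                $p--
            }
        }
    }
}

# Option B: leave the collection untouched and scan all of it every time.
$withoutRemoval = Measure-Command {
    $matched = 0
    foreach ($id in $ids) {
        foreach ($name in $names) {
            if ($name.Contains($id)) { $matched++ }
        }
    }
}

'With removal:    {0:N0} ms' -f $withRemoval.TotalMilliseconds
'Without removal: {0:N0} ms' -f $withoutRemoval.TotalMilliseconds
```

Run it a few times before drawing conclusions, since a single measurement can be noisy.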

Answer 1 (score: 0)

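Here is a way to test for matches between the items in a CSV import and the items in a collection of file names. It relies on how -match behaves when its left-hand side is a collection rather than a single item: it returns the matching elements instead of a Boolean.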

# fake reading in a CSV file
#    in real life, use Import-CSV
$InStuff = @'
ID
Zero000
Alpha111
Bravo222
Charlie333
Santa666
'@ | ConvertFrom-Csv

# fake reading in a list of file names
#    in real life, use Get-ChildItem
$FileList = @(
    [System.IO.FileInfo]'Alpha111.jpg'
    [System.IO.FileInfo]'Charlie333.jpg'
    [System.IO.FileInfo]'Santa666.jpg'
    )

foreach ($IS_Item in $InStuff)
    {
    # this depends on there only being ONE matching file
    $FoundFile = $FileList -match $IS_Item.ID

    if ($FoundFile)
        {
        'Found a file for     = {0}' -f $IS_Item.ID
        '    The file name is = {0}' -f $FoundFile.Name
        }
        else
        {
        Write-Warning ('No matching file was found for {0}' -f $IS_Item.ID)
        }
    }

Output...

WARNING: No matching file was found for Zero000
Found a file for     = Alpha111
    The file name is = Alpha111.jpg
WARNING: No matching file was found for Bravo222
Found a file for     = Charlie333
    The file name is = Charlie333.jpg
Found a file for     = Santa666
    The file name is = Santa666.jpg
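
The behavior of -match that this answer depends on can be seen in isolation. This is a small sketch using hypothetical file names that are not from the original post; it also shows why the "only ONE matching file" comment matters, and that the right-hand side is treated as a regular expression.

```powershell
# Hypothetical file names, for illustration only.
$names = 'Alpha111.jpg', 'Alpha111_b.jpg', 'Charlie333.jpg'

# Single string on the left: -match returns a Boolean.
$isMatch = 'Alpha111.jpg' -match 'Alpha111'

# Collection on the left: -match returns the elements that match.
# Wrapping in @() keeps the result an array even for zero or one hit.
$hits = @($names -match 'Alpha111')    # two elements here, not one
$none = @($names -match 'Bravo222')    # empty array

# The pattern is a regex, so escape values that may contain special characters.
$safe = @($names -match [regex]::Escape('Alpha111.jpg'))

'Boolean result : {0}' -f $isMatch
'Hits for Alpha : {0}' -f $hits.Count
'Hits for Bravo : {0}' -f $none.Count
'Escaped match  : {0}' -f ($safe -join ', ')
```

When more than one file can match an ID, loop over the returned elements instead of treating the result as a single file.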