PowerShell: match on a property, then selectively combine objects to create a third

Date: 2017-01-25 20:37:51

Tags: arrays powershell object

I have a solution, but I'm sure it isn't the best approach because it takes forever, so I'm looking for a faster/better/smarter way.

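I have several pscustomobject collections imported from .csv files. Each collection shares at least one property with the others. One is relatively small (about 200-300 items/rows), but another is quite large (about 60,000-100,000 items). The contents of one may or may not match the contents of the other.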

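I need to find where two objects match on a specific property and then combine the properties of each into a single object that has all or most of the properties.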

A sample snippet of the code (not exactly the same, but it should work for this purpose; see the image for sample data): DataTables

Write-Verbose "Pulling basic Fruit data together"
$Purchase = import-csv "C:\Purchase.csv"
$Selling = import-csv "C:\Selling.csv"

Write-Verbose "Combining Fruit names and removing duplicates"
$Fruits = $Purchase.Fruit
$Fruits += $Selling.Fruit
$Fruits = $Fruits | Sort-Object -Unique

$compareData = @()

Foreach ($Fruit in $Fruits) {
    # Adding Purchase and Selling data
    $IndResults = [pscustomobject]@{
        Farmer = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Farmer
        Region = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Region
        Water  = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Water
        Market = $Selling.Where({$PSItem.Fruit -eq $Fruit}).Market
        Cost   = $Selling.Where({$PSItem.Fruit -eq $Fruit}).Cost
        Tax    = $Selling.Where({$PSItem.Fruit -eq $Fruit}).Tax
    }
    Write-Verbose "Loading Individual results into response"
    $CompareData += $IndResults
}

Write-Output $CompareData

I think the problem lies with lines like the following:

Farmer = $Purchase.Where({$PSItem.Fruit -eq $Fruit}).Farmer

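If I understand this correctly, it scans the whole $Purchase object every time it passes through this line. I'm looking for a way to speed up the whole process rather than having every match attempt scan the entire object.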

4 answers:

Answer 0 (score: 1)

Use this Join-Object:

$Purchase | Join $Selling -On Fruit | Format-Table

Result (using Simon Catlin's data):

Fruit      Farmer  Region     Water Market  Cost Tax
-----      ------  ------     ----- ------  ---- ---
Apple      Adam    Alabama    1     MarketA 10   0.1
Cherry     Charlie Cincinnati 2     MarketC 20   0.2
Damson     Daniel  Derby      3     MarketD 30   0.3
Elderberry Emma    Eastbourne 4     MarketE 40   0.4
Fig        Freda   Florida    5     MarketF 50   0.5

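Answer 1 (score: 0)

I ran into this problem when trying to reconcile employee data from an HR system with employee data in an AD forest. There were thousands of rows, and the process took an age.

I ended up moving away from custom objects and reverting to old-school hash tables.

Each hash table entry then holds a sub-hash table containing the data. In your case, the outer hash would be keyed on $fruit, and the sub-hash would contain the various properties, e.g. farm, region, and so on.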

Hash tables are lightning fast by comparison. It's a shame that PowerShell is so slow in this respect.
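As a rough illustration of the nested layout described above, here is a minimal sketch. The in-memory sample rows and the variable names `$purchaseRows` and `$fruitIndex` are assumptions made for the example, and it presumes each fruit appears only once in the data.

```powershell
# Hypothetical in-memory rows standing in for the purchase CSV.
$purchaseRows = @'
Fruit,Farmer,Region,Water
Apple,Adam,Alabama,1
Cherry,Charlie,Cincinnati,2
'@ | ConvertFrom-Csv

# Outer hash table keyed on the fruit name; each value is a sub-hash of properties.
$fruitIndex = @{}
foreach ($row in $purchaseRows) {
    $fruitIndex[$row.Fruit] = @{
        Farmer = $row.Farmer
        Region = $row.Region
        Water  = $row.Water
    }
}

# A keyed lookup goes straight to the entry instead of scanning every row.
$fruitIndex['Cherry']['Farmer']
```

The lookup on the last line does not depend on how many rows were loaded, which is where the speed-up over repeated `.Where()` scans comes from.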

Shout if you need more info.

26/01 Sample code... assuming I've understood the requirements correctly:

PURCHASE.CSV:

Fruit,Farmer,Region,Water
Apple,Adam,Alabama,1
Cherry,Charlie,Cincinnati,2
Damson,Daniel,Derby,3
Elderberry,Emma,Eastbourne,4
Fig,Freda,Florida,5

SELLING.CSV

Fruit,Market,Cost,Tax
Apple,MarketA,10,0.1
Cherry,MarketC,20,0.2
Damson,MarketD,30,0.3
Elderberry,MarketE,40,0.4
Fig,MarketF,50,0.5

CODE

[String]       $Local:strPurchaseFile    = 'c:\temp\purchase.csv';
[String]       $Local:strSellingFile     = 'c:\temp\selling.csv';
[HashTable]    $Local:objFruitHash       = @{};
[System.Array] $Local:objSelectStringHit = $null;
[String]       $Local:strFruit           = '';

if ( (Test-Path -LiteralPath $strPurchaseFile -PathType Leaf) -and (Test-Path -LiteralPath $strSellingFile -PathType Leaf) ) {

    #
    # Populate data from purchase file.
    #
    foreach ( $objSelectStringHit in (Select-String -LiteralPath $strPurchaseFile -Pattern '^([^,]+),([^,]+),([^,]+),([^,]+)$' | Select-Object -Skip 1) ) {
        $objFruitHash[ $objSelectStringHit.Matches[0].Groups[1].Value ] = @{ 'Farmer' = $objSelectStringHit.Matches[0].Groups[2].Value;
                                                                             'Region' = $objSelectStringHit.Matches[0].Groups[3].Value;
                                                                             'Water'  = $objSelectStringHit.Matches[0].Groups[4].Value;
                                                                           };
        } #foreach-purchase-row

    #
    # Populate data from selling file.
    #
    foreach ( $objSelectStringHit in (Select-String -LiteralPath $strSellingFile -Pattern '^([^,]+),([^,]+),([^,]+),([^,]+)$' | Select-Object -Skip 1) ) {
        $objFruitHash[ $objSelectStringHit.Matches[0].Groups[1].Value ] += @{ 'Market' = $objSelectStringHit.Matches[0].Groups[2].Value;
                                                                              'Cost'   = [Convert]::ToDecimal( $objSelectStringHit.Matches[0].Groups[3].Value );
                                                                              'Tax'    = [Convert]::ToDecimal( $objSelectStringHit.Matches[0].Groups[4].Value );
                                                                            };
        } #foreach-selling-row

    #
    # Output data.  At this point, you could now build a PSCustomObject.
    #
    foreach ( $strFruit in ($objFruitHash.Keys | Sort-Object) ) {
        Write-Host -Object ( '{0,-15}{1,-15}{2,-15}{3,-10}{4,-10}{5,10:C}{6,10:P}' -f 
                                         $strFruit,
                                         $objFruitHash[$strFruit]['Farmer'],
                                         $objFruitHash[$strFruit]['Region'],
                                         $objFruitHash[$strFruit]['Water'],
                                         $objFruitHash[$strFruit]['Market'],
                                         $objFruitHash[$strFruit]['Cost'],
                                         $objFruitHash[$strFruit]['Tax']
                           );
        } #foreach

} else {
    Write-Error -Message 'File error.';
} #else-if
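
The comment in the output step mentions that a PSCustomObject could be built at that point. One possible way to do that is sketched below; `$fruitHash` is a hypothetical, hand-written stand-in for the nested hash table produced by the code above, so the block can run on its own.

```powershell
# Hypothetical stand-in for the nested hash table built above (two fruits only).
$fruitHash = @{
    Apple  = @{ Farmer = 'Adam';    Region = 'Alabama';    Water = '1'; Market = 'MarketA'; Cost = 10; Tax = 0.1 }
    Cherry = @{ Farmer = 'Charlie'; Region = 'Cincinnati'; Water = '2'; Market = 'MarketC'; Cost = 20; Tax = 0.2 }
}

# Emit one object per fruit; capturing the loop output avoids growing an array with +=.
$compareData = foreach ($fruit in ($fruitHash.Keys | Sort-Object)) {
    $entry = $fruitHash[$fruit]
    [pscustomobject]@{
        Fruit  = $fruit
        Farmer = $entry['Farmer']
        Region = $entry['Region']
        Water  = $entry['Water']
        Market = $entry['Market']
        Cost   = $entry['Cost']
        Tax    = $entry['Tax']
    }
}

$compareData | Format-Table
```

Objects built this way can be piped on to cmdlets such as Export-Csv, which the Write-Host output above cannot.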

Answer 2 (score: 0)

Use Join-Object:

http://ramblingcookiemonster.github.io/Join-Object/

Join-Object -Left $purchase -Right $selling -LeftJoinProperty fruit -RightJoinProperty fruit -Type OnlyIfInBoth | ft

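Answer 3 (score: 0)

I needed to do something similar myself. I wanted to take two System.Array objects and compare them to pull out the matching results, without having to manipulate the input data each time. This is the approach I used; I appreciate that it is inefficient, but for the 200 or so records I had to work with it was instantaneous.

I've tried to translate what I was doing (users and their old and new directories) into farmers, fruit, markets and so on, so I hope it makes sense!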

$Purchase = import-csv "C:\Purchase.csv"
$Selling = import-csv "C:\Selling.csv"

$compareData = @()
$Total = 0
foreach ($iPurch in $Purchase) {
    foreach ($iSell in $Selling) {
        # -eq compares whole values; -match would treat the right-hand side as a regex
        # and also accept partial matches (e.g. "Pineapple" -match "Apple").
        if ($iPurch.Fruit -eq $iSell.Fruit) {
            Write-Host "Match found between $($iPurch.Fruit) and $($iSell.Fruit)"
            $hash = @{
                Fruit  = $iPurch.Fruit
                Farmer = $iPurch.Farmer
                Region = $iPurch.Region
                Water  = $iPurch.Water
                Market = $iSell.Market
                Cost   = $iSell.Cost
                Tax    = $iSell.Tax
            }
            $Build = New-Object PSObject -Property $hash
            $Total = $Total + 1
            $compareData += $Build
        }
    }
}
Write-Host "Processed $Total records"
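
For larger inputs, the nested loop above compares every purchase row with every selling row. A possible alternative, sketched here under the assumption that each fruit appears at most once in the selling data, is to index one side in a hash table and walk the other side once. The inline sample data and the names `$sellingByFruit`, `$p` and `$s` are illustrative only.

```powershell
# Hypothetical in-memory data standing in for the two CSV files.
$purchase = @'
Fruit,Farmer,Region,Water
Apple,Adam,Alabama,1
Cherry,Charlie,Cincinnati,2
Damson,Daniel,Derby,3
'@ | ConvertFrom-Csv

$selling = @'
Fruit,Market,Cost,Tax
Apple,MarketA,10,0.1
Cherry,MarketC,20,0.2
Fig,MarketF,50,0.5
'@ | ConvertFrom-Csv

# Index the selling rows by fruit (assumes each fruit appears at most once here).
$sellingByFruit = @{}
foreach ($row in $selling) { $sellingByFruit[$row.Fruit] = $row }

# A generic list grows in place, unlike += on an array, which copies the array each time.
$compareData = New-Object 'System.Collections.Generic.List[object]'
foreach ($p in $purchase) {
    $s = $sellingByFruit[$p.Fruit]
    if ($null -ne $s) {
        # Only fruits present on both sides are kept.
        $compareData.Add([pscustomobject]@{
            Fruit  = $p.Fruit
            Farmer = $p.Farmer
            Region = $p.Region
            Water  = $p.Water
            Market = $s.Market
            Cost   = $s.Cost
            Tax    = $s.Tax
        })
    }
}

Write-Host "Processed $($compareData.Count) records"
```

With this sample data only Apple and Cherry appear in both sets, so Damson and Fig are dropped from the result.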