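Convert segmented, period-delimited nodes into a JSON object

Date: 2016-10-26 17:04:49

Tags: json powershell

I have a lot of string entries (they form a namespace/class tree) that look like this: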

appsystem
appsystem.applications
appsystem.applications.APPactivities
appsystem.applications.APPmanager
appsystem.applications.APPmodels
appsystem.applications.MAPmanager
appsystem.applications.MAPmanager.maphub
appsystem.applications.MAPmanager.mapmanager
appsystem.applications.pagealertsmanager
appsystem.authentication
appsystem.authentication.manager
appsystem.authentication.manager.encryptionmanager
appsystem.authentication.manager.sso
appsystem.authentication.manager.tokenmanager

However, I need the final output to look like this:

{
    "name": "appsystem",
    "children": [
        {
            "name": "applications",
            "children": [
                {"name": "APPactivities"},
                {"name": "APPmanager"},
                {"name": "APPmodels"},
                {
                    "name": "MAPmanager",
                    "children": [
                        {"name": "maphub"},
                        {"name": "mapmanager"}
                    ]
                },
                {"name": "pagealertsmanager"}
            ]
        },
        {
            "name": "authentication",
            "children": [
                {
                    "name": "manager",
                    "children": [
                        {"name": "encryptionmanager"},
                        {"name": "sso"},
                        {"name": "tokenmanager"}
                    ]
                }
            ]
        }
    ]
}

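The total number of nodes can be arbitrary.

I assume I need recursion, but I don't even know where to start.

2 Answers:

Answer 0: (score: 3)

This builds nested lists; PowerShell's ConvertTo-Json flattens the outer list.

You can change $Line in $s to $line in (Get-Content input.txt).

But I think this does it: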

$s = @'
appsystem
appsystem.applications
appsystem.applications.APPactivities
appsystem.applications.APPmanager
appsystem.applications.APPmodels
appsystem.applications.MAPmanager
appsystem.applications.MAPmanager.maphub
appsystem.applications.MAPmanager.mapmanager
appsystem.applications.pagealertsmanager
appsystem.authentication
appsystem.authentication.manager
appsystem.authentication.manager.encryptionmanager
appsystem.authentication.manager.sso
appsystem.authentication.manager.tokenmanager
'@ -split '\r?\n'   # regex split: handles both CRLF and LF line endings

$TreeRoot = New-Object System.Collections.ArrayList

foreach ($Line in $s) {

    $CurrentDepth = $TreeRoot

    $RemainingChunks = $Line.Split('.')
    while ($RemainingChunks)
    {

        # If there is a dictionary at this depth then use it, otherwise create one.
        $Item = $CurrentDepth | Where-Object {$_.name -eq $RemainingChunks[0]}
        if (-not $Item)
        {
            $Item = @{name=$RemainingChunks[0]}
            $null = $CurrentDepth.Add($Item)
        }

        # If there will be child nodes, look for a 'children' node, or create one.
        if ($RemainingChunks.Count -gt 1)
        {
            if (-not $Item.ContainsKey('children'))
            {
                $Item['children'] = New-Object System.Collections.ArrayList
            }

            $CurrentDepth = $Item['children']
        }

        $RemainingChunks = $RemainingChunks[1..$RemainingChunks.Count]
    }
}

$TreeRoot | ConvertTo-Json -Depth 1000
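
The remark above about ConvertTo-Json "flattening" the outer list can be checked with a small sketch. The sample node names below are made up for illustration and are not from the question's data. Piping an ArrayList enumerates it, so a list holding a single root arrives as one object and is emitted as a JSON object, while a list with several roots is emitted as a JSON array:

```powershell
# Hypothetical demo data: one list with a single root, one with two roots.
$oneRoot = New-Object System.Collections.ArrayList
$null = $oneRoot.Add(@{ name = 'appsystem' })

$twoRoots = New-Object System.Collections.ArrayList
$null = $twoRoots.Add(@{ name = 'appsystem' })
$null = $twoRoots.Add(@{ name = 'othersystem' })

# The pipeline unrolls each list, so ConvertTo-Json receives the elements,
# not the list object itself.
$jsonOne = $oneRoot  | ConvertTo-Json -Compress
$jsonTwo = $twoRoots | ConvertTo-Json -Compress

$jsonOne   # a single JSON object: {"name":"appsystem"}
$jsonTwo   # a JSON array containing both root objects
```

This is why the question's sample input (one shared root, appsystem) yields a single top-level object even though $TreeRoot is a list.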

Edit: Is this too slow? I tried some random pausing profiling and found (not too surprisingly) that the culprit is the inner nested loop, which searches the children arrays for matching child nodes and gets hit far too many times.

Here is a redesigned version. It still builds the tree, but this time it also builds a TreeMap hashtable of shortcuts into the tree, pointing at every previously built node, so it can jump straight to the right node instead of searching the children lists for it.

I made a test file of roughly 20k random lines. The original code processed it in 108 seconds; this version does it in 1.5 seconds, and the output matches.

$TreeRoot = New-Object System.Collections.ArrayList
$TreeMap = @{}

foreach ($line in (Get-Content d:\out.txt)) {

    $_ = ".$line"    # easier if the lines start with a dot

    if ($TreeMap.ContainsKey($_))    # Skip duplicate lines
    {
        continue
    }

    # build a subtree from the right. a.b.c.d.e -> e  then  d->e  then  c->d->e
    # keep going until base 'a.b' reduces to something already in the tree, connect new bit to that.
    $LineSubTree = $null
    $TreeConnectionPoint = $null

    do {
        $lastDotPos = $_.LastIndexOf('.')
        $leaf = $_.Substring($lastDotPos + 1)
        $_ = $_.Substring(0, $lastDotPos)

        # push the leaf on top of the growing subtree
        $LineSubTree = if ($LineSubTree) {
                           @{"name"=$leaf; "children"=([System.Collections.ArrayList]@($LineSubTree))}
                       } else {
                           @{"name"=$leaf}
                       }

        $TreeMap["$_.$leaf"] = $LineSubTree

    } while (!($TreeConnectionPoint = $TreeMap[$_]) -and $_)

    # Now we have a branch built to connect in to the existing tree
    # but is there somewhere to put it?
    if ($TreeConnectionPoint)
    {
        if ($TreeConnectionPoint.ContainsKey('children'))
        {
            $null = $TreeConnectionPoint['children'].Add($LineSubTree)
        } else {
            $TreeConnectionPoint['children'] = [System.Collections.ArrayList]@($LineSubTree)
        }
    }
    else
    {
        # nowhere to put it, this is a new root level connection
        $null = $TreeRoot.Add($LineSubTree)
    }
}

$TreeRoot | ConvertTo-Json -Depth 100

(@mklement0's code takes 103 seconds and produces wildly different output: 5.4M characters of JSON instead of 10.1M characters. [Edit: that is because my code allows multiple root nodes in the list, which my test file has, and their code doesn't.])

Auto-generated PS help links from my code block (if available):

  • New-Object (in module Microsoft.PowerShell.Utility)
  • Get-Content (in module Microsoft.PowerShell.Management)
  • ConvertTo-Json (in module Microsoft.PowerShell.Utility)

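Answer 1: (score: 2)

To complement TessellatingHeckler's great answer with an alternative implementation that uses a recursive function.

The emphasis is on modularity and concision, not performance. [1]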

# Outer function that loops over all paths and builds up one or more nested
# hashtables reflecting the path hierarchy, which are converted to JSON on output.
# Note that only a single JSON object is output if all paths share the same root
# component; otherwise, a JSON *array* is output.
function convert-PathsToNestedJsonObject([string[]] $paths) {
  $hts = New-Object Collections.ArrayList
  $paths.ForEach({
    $rootName = $_.split('.')[0] 
    $ht = $hts.Where({ $_.name -eq $rootName }, 'First')[0]
    if (-not $ht) { [void] $hts.Add(($ht = @{})) }
    convert-PathToNestedHashtable $ht $_ 
  })
  $hts | ConvertTo-Json -Depth 100
}

# Recursive helper function that takes a path such as "appsystem.applications"
# and converts it into a nested hashtable with keys "name" and "children" to
# reflect the path hierarchy. 
function convert-PathToNestedHashtable([hashtable] $ht, [string] $path) {
  $name, $childName, $rest = $path -split '\.', 3
  $ht.name = $name
  if ($childName) {
    if ($ht.children) { 
      $htChild = $ht.children.Where({ $_.name -eq $childName }, 'First')[0]
    } else {
      $ht.children = New-Object Collections.ArrayList
      $htChild = $null
    }
    if (-not $htChild) {      
      [void] $ht.children.Add(($htChild = @{}))
    }
    convert-PathToNestedHashtable $htChild "$childName.$rest" 
  }
}

# Call the outer function with the input paths (assumed to be stored in $paths).
convert-PathsToNestedJsonObject $paths

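[1] One deliberate type of optimization was applied that nevertheless keeps the code concise:

PSv4+ offers the (little-known) methods .ForEach() and .Where(), which are not only noticeably faster than their cmdlet counterparts, ForEach-Object and Where-Object, but also offer additional features.

Specifically:

  • $paths.ForEach({ ... }) is used instead of
    $paths | ForEach-Object { ... }

  • $ht.children.Where({ $_.name -eq $childName }, 'First')[0] is used instead of
    $ht.children | Where-Object { $_.name -eq $childName } | Select-Object -First 1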
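
The two substitutions listed above can be compared side by side in a minimal sketch. The $children list and its node names are hypothetical stand-ins for $ht.children, and the sketch assumes PowerShell v4 or later:

```powershell
# Hypothetical sample data standing in for $ht.children.
$children = New-Object System.Collections.ArrayList
$null = $children.Add(@{ name = 'applications' })
$null = $children.Add(@{ name = 'authentication' })

# Method form: the 'First' mode stops at the first match; .Where() returns a
# collection, so [0] picks out the matching element.
$viaMethod = $children.Where({ $_.name -eq 'authentication' }, 'First')[0]

# Equivalent pipeline form using cmdlets.
$viaPipeline = $children |
    Where-Object { $_.name -eq 'authentication' } |
    Select-Object -First 1

# .ForEach() runs the script block per element and collects its output.
$names = $children.ForEach({ $_.name })

$viaMethod.name
$viaPipeline.name
$names -join ', '
```

Both lookups find the same node; the method form simply avoids the pipeline overhead the footnote refers to.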