Question

I am parsing HTML from a web server (specifically a Fanuc controller) and assigning the innerText to an object.

# Make sure the controller responds
if ($webBody.StatusCode -eq 200) {
    Write-Host "Response is Good!" -ForegroundColor DarkGreen
    $preBody = $webBody.ParsedHtml.body.getElementsByTagName('PRE') | Select -ExpandProperty innerText
    $preBody
}

The output looks something like this:

  [1-184 above]
  [185] = 0  ''   
  [186] = 0  ''   
  [187] = 0  ''   
  [188] = 0  ''   
  [189] = 0  '' 
  [and so on]

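I only want to read the data in 190, 191, and 193. What is the best way to do this? I am struggling to eliminate the unwanted data from the object.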

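Currently I have a VBScript application that outputs to a txt file, cleans the data, then reads it back and manipulates it into a SQL insert. I am trying to improve on it with PowerShell and would like to keep everything within the program if possible.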

Any help is greatly appreciated.

Answer 1

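Assuming the data set is not too large to hold in memory, you can use a regular expression to parse each line into a PowerShell object and then filter the objects with Where-Object.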

# Regex with a capture group for each important value: index, integer, string
$RegEx = "\[(\d+)\]\s*=\s*(\d+)\s+'(.*)'"
$IndexesToMatch = @(190, 191, 193)
# innerText is a single multi-line string, so split it into lines first
$ParsedValues = $preBody -split '\r?\n' | ForEach-Object {
    # Skip lines that do not look like "[n] = value 'text'"
    if ($_ -match $RegEx) {
        [PSCustomObject]@{
            index  = $Matches[1]
            int    = $Matches[2]
            string = $Matches[3]
        }
    }
}
$ParsedValues | Where-Object { $_.index -in $IndexesToMatch }

Input:

[190] = 1  'a'
[191] = 2  'b'
[192] = 3  'c'
[193] = 4  'd'
[194] = 5  'e'

Output:

index int string
----- --- ------
190   1   a
191   2   b
193   4   d
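
If only a handful of entries are needed, one possible variation is to filter while parsing and key the results by index, so a single entry can be looked up directly. This is only a sketch: `$sampleText` is made-up data standing in for the controller's PRE text, and the names `$wantedIndexes`, `$pattern`, and `$byIndex` are hypothetical.

```powershell
# Made-up sample standing in for the PRE innerText returned by the controller
$sampleText = @'
  [189] = 0  ''
  [190] = 1  'a'
  [191] = 2  'b'
  [192] = 3  'c'
  [193] = 4  'd'
'@

$wantedIndexes = 190, 191, 193
$pattern = "\[(\d+)\]\s*=\s*(\d+)\s+'(.*)'"

# Build a hashtable keyed by index, keeping only the wanted entries
$byIndex = @{}
foreach ($line in ($sampleText -split '\r?\n')) {
    if ($line -match $pattern -and [int]$Matches[1] -in $wantedIndexes) {
        $byIndex[[int]$Matches[1]] = [PSCustomObject]@{
            Int    = [int]$Matches[2]
            String = $Matches[3]
        }
    }
}

# Look up a single entry by its index
$byIndex[191].String
```

Casting the index and value to `[int]` here is a design choice that makes later numeric comparisons or a database insert simpler; keep them as strings if the original formatting matters.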
