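Sorry, my knowledge of PowerShell is limited. I am trying to read HTML content from a website and write it out as a CSV file. So far, my PowerShell script successfully downloads the whole HTML page: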
$url = "http://cloudmonitor.ca.com/en/ping.php?vtt=1392966369&varghost=www.yahoo.com&vhost=_&vaction=ping&ping=start";
$Path = "$env:userprofile\Desktop\test.txt"
$ie = New-Object -com InternetExplorer.Application
$ie.visible = $true
$ie.navigate($url)
while($ie.ReadyState -ne 4) { start-sleep -s 10 }
#$ie.Document.Body.InnerText | Out-File -FilePath $Path
$ie.Document.Body | Out-File -FilePath $Path
$ie.Quit()
The HTML I get back looks like this:
........
<tr class="light-grey-bg">
<td class="right-dotted-border">Stockholm, Sweden (sesto01):</td>
<td class="right-dotted-border"><span id="cp20">Okay</span>
</td>
<td class="right-dotted-border"><span id="minrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="avgrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="maxrtt20">21.9</span>
</td>
<td><span id="ip20">2a00:1288:f00e:1fe::3001</span>
</td>
</tr>
........
But what I really want is to write the content to a CSV file, like this:
Stockholm Sweden (sesto01),Okay,21.8,21.8,21.9,2a00:1288:f00e:1fe::3001
........
Which command can help me do this?
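Answer 0 (score: 1)
Thanks for pointing out the CA site, I found it interesting too. I wrote this quickly, on the corner of a table so to speak, and it needs improvement.
Here is one way to do it with Html-Agility-Pack. In what follows, I assume HtmlAgilityPack.dll sits in an Html-Agility-Pack subdirectory of the directory that contains the script file.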
# PingFromTheCloud.ps1
$url = "http://cloudmonitor.ca.com/en/ping.php?vtt=1392966369&varghost=www.silogix.fr&vhost=_&vaction=ping&ping=start";
$Path = "c:\temp\Pingtest.htm"
$ie = New-Object -com InternetExplorer.Application
$ie.visible = $true
$ie.navigate($url)
while($ie.ReadyState -ne 4) { start-sleep -s 10 }
#$ie.Document.Body.InnerText | Out-File -FilePath $Path
$ie.Document.Body | Out-File -FilePath $Path
$ie.Quit()
Add-Type -Path "$(Split-Path -parent $PSCommandPath)\Html-Agility-Pack\HtmlAgilityPack.dll"
$webGraber = New-Object -TypeName HtmlAgilityPack.HtmlWeb
$webDoc = $webGraber.Load($Path)
$Thetable = $webDoc.DocumentNode.ChildNodes.Descendants('table') | where {$_.XPath -eq '/div[3]/div[1]/div[5]/table[1]/table[1]'}
$trDatas = $Thetable.ChildNodes.Elements("tr")
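# Delete any previous output; do not fail if the file does not exist yet
Remove-Item "c:\temp\Pingtest.csv" -ErrorAction SilentlyContinue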
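foreach ($trData in $trDatas)
{
    $tdDatas = $trData.elements("td")
    $line = ""
    foreach ($tdData in $tdDatas)
    {
        # Append each cell's trimmed text followed by a separator
        $line = $line + $tdData.InnerText.Trim() + ','
    }
    # Skip rows without <td> cells, otherwise Remove() throws on an empty string
    if ($line.Length -gt 0)
    {
        # Drop the trailing comma and append the row to the CSV file
        $line.Remove($line.Length - 1) | Out-File -FilePath "c:\temp\Pingtest.csv" -Append
    }
}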
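One thing the script above does not handle: the first cell ("Stockholm, Sweden (sesto01):") contains a comma and a trailing colon, so the row would not match the CSV line requested in the question. The following is only a minimal sketch of that cleanup, not a replacement for the Html-Agility-Pack approach. To stay self-contained it runs on the sample row from the question and pulls the cells out with regular expressions, which is fragile on real HTML. `ConvertTo-CsvField` is a hypothetical helper name introduced here for illustration.

```powershell
# Sample row copied from the question; a real run would read the saved page instead.
$html = @'
<tr class="light-grey-bg">
<td class="right-dotted-border">Stockholm, Sweden (sesto01):</td>
<td class="right-dotted-border"><span id="cp20">Okay</span>
</td>
<td class="right-dotted-border"><span id="minrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="avgrtt20">21.8</span>
</td>
<td class="right-dotted-border"><span id="maxrtt20">21.9</span>
</td>
<td><span id="ip20">2a00:1288:f00e:1fe::3001</span>
</td>
</tr>
'@

# Hypothetical helper (not part of the original script): makes one cell CSV-safe.
# Assumption: no value legitimately ends with ':' (an IPv6 address ending in '::' would be altered).
function ConvertTo-CsvField([string]$Text) {
    $clean = $Text.Trim().TrimEnd(':')   # drop surrounding whitespace and trailing colons
    $clean -replace ',', ''              # drop embedded commas, as in the requested output
}

$csvLines = foreach ($row in [regex]::Matches($html, '(?s)<tr\b.*?</tr>')) {
    $fields = foreach ($cell in [regex]::Matches($row.Value, '(?s)<td\b[^>]*>(.*?)</td>')) {
        # Strip nested tags such as <span>, then clean the remaining text
        ConvertTo-CsvField ($cell.Groups[1].Value -replace '<[^>]+>', '')
    }
    $fields -join ','
}

$csvLines
```

With Html-Agility-Pack, the same helper could be applied to `$tdData.InnerText` in place of the plain `Trim()` call. Piping `$csvLines` to `Set-Content` would write the file in one step and overwrite any earlier output, so the `Remove-Item` and `-Append` steps would no longer be needed.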