I need to get the values of selected columns from a web page table using PowerShell. My code is:
$url1 = "D:\MyPowershell\Html.html"
$ie = New-Object -com internetexplorer.application;
$ie.visible = $true;
$ie.navigate($url1);
while ($ie.Busy -eq $true)
{
    Start-Sleep 10;
    # $ie.refresh();
}
$a = Get-Content "D:\MyPowershell\Html.html"
$x = ($ie.document.getElementsByTagName("tr")) |
    where {
        ($_.innerText -match "2 - High") -and
        $_.innerText -notmatch "Work in Progress"
    } |
    % {
        $number, $priority, $state = $_.children | select -Expand innerText
        New-Object -Type PSObject -Property @{
            'Number'   = $number
            'Priority' = $priority
            'State'    = $state
        }
    } | select Number, Priority | Export-csv 'D:\Html.csv' -NoType -Delimiter "`t"
This is a web page I created myself. It contains no other tables. The HTML of the page is:
<html>
<head>
</head>
<body>
<table>
<thead>
<tr>
<th name="number"> Number</th>
<th name="priority">Priority</th>
<th name="state">State</th>
</tr>
</thead>
<tbody>
<tr>
<td name="check_task" class="list_checkbox ">INC0811168</td>
<td class="vt" title="" style="background-color:orange">2 - High</td>
<td style="" class="vt" title="">Assigned</td>
</tr>
<tr>
<td name="check_task" class="list_checkbox ">INC081rr68</td>
<td class="vt" title="">0 - None</td>
<td style="" class="vt" title="">Work in Progress</td>
</tr>
</tbody>
</table>
</body>
</html>
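With this edited code I get no errors, but the values do not show up in the CSV; only the 'Html' file is created. Instead, I see some encoded text when I open it in Notepad. Here INC0811168, '2 - High' and 'Assigned' are separate fields. I need to get the data with the individual column values kept separate. I also need to filter the data so that I only get the 'INC****' and '2 - High' columns. The retrieved data should be exported to CSV. How can I do this?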
Answer 0 (score: 0)
Expanding the inner text of the row collapses the individual fields into a single string. Use something like this instead:
$ie.document.getElementsByTagName("tr") | where {
    $_.innerText -match "2 - High" -and
    $_.innerText -notmatch "Work in Progress"
} | % {
    $id, $priority, $status, $summary = $_.children | select -Expand innerText
    New-Object -Type PSObject -Property @{
        'ID'       = $id
        'Priority' = $priority
        'Status'   = $status
        'Summary'  = $summary
    }
} | select ID, Priority | Export-Csv 'D:\Html.csv' -NoType -Delimiter "`t"
Replace `t (the tab character) with whatever delimiter you want the CSV to use.
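To try the per-cell idea without launching Internet Explorer, here is a minimal sketch. It assumes the table body from the question can be pasted inline and parsed as XML, which works only because this particular markup is well formed. The names `$doc`, `$records` and `$csv` are made up for illustration, the cell attributes are omitted for brevity, and `ConvertTo-Csv` stands in for `Export-Csv` so the result can be inspected instead of written to disk.

```powershell
# Hypothetical stand-in for the page: the table rows from the question, parsed as XML.
[xml]$doc = @'
<table>
  <tbody>
    <tr><td>INC0811168</td><td>2 - High</td><td>Assigned</td></tr>
    <tr><td>INC081rr68</td><td>0 - None</td><td>Work in Progress</td></tr>
  </tbody>
</table>
'@

$records = foreach ($tr in $doc.SelectNodes('//tbody/tr')) {
    # Read every cell on its own so the fields never merge into one string.
    $number, $priority, $state = $tr.SelectNodes('td') | ForEach-Object { $_.InnerText }
    [pscustomobject]@{
        Number   = $number
        Priority = $priority
        State    = $state
    }
}

# Filter on the separate properties, keep two columns, and render as CSV text.
$csv = $records |
    Where-Object { $_.Priority -eq '2 - High' -and $_.State -ne 'Work in Progress' } |
    Select-Object Number, Priority |
    ConvertTo-Csv -NoTypeInformation -Delimiter ','

$csv
```

Because each record carries `Number`, `Priority` and `State` as distinct properties, the filter can test a single column rather than matching against the text of the whole row. Swapping `ConvertTo-Csv` for `Export-Csv 'D:\Html.csv'` with the same `-NoTypeInformation` and `-Delimiter` arguments should write the same content to a file.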