Question

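I am using HP Operations Orchestration and I am creating a flow that should read a TXT file (containing HTML source code) and extract information from it. What I want is the data between the <h2></h2> tags.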

Unfortunately, when I start the flow, the "Read from file" operation goes into an infinite loop and never stops.

Maybe someone has used this program and knows how to get the data I want? If you need more details, please ask in the comments.

Some information:

I got the source code using a PowerShell script
I don't use any JavaScript scripts (but I can add them)

Answer 1

This should work:

$data >> "$env:USERPROFILE/sc.txt"
$result = Get-Content "$env:USERPROFILE/sc.txt" | foreach { if ($_ -match "<h2>(.*?)</h2>") { $matches[1] } }
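
The one-liner above reads the file line by line, so it returns at most one title per line and misses an <h2> element that spans several lines. As a minimal sketch, assuming the whole file fits in memory, the text can be searched in one pass with [regex]::Matches. The sample string below is hypothetical data standing in for sc.txt; with a real file it could be loaded with Get-Content "$env:USERPROFILE/sc.txt" -Raw (PowerShell 3.0 or later).

```powershell
# Hypothetical sample standing in for the contents of sc.txt.
$html = @'
<div><h2>First title</h2><p>text</p><h2>Second title</h2></div>
<h2>
  Third title
</h2>
'@

# (?s) lets "." match newlines, so an <h2> that spans lines is still found.
# (?i) ignores case, and [^>]* tolerates attributes such as <h2 class="x">.
$pattern = '(?si)<h2[^>]*>(.*?)</h2>'

# Collect the captured text of every match and trim surrounding whitespace.
$titles = [regex]::Matches($html, $pattern) |
    ForEach-Object { $_.Groups[1].Value.Trim() }

$titles
```

A regular expression is only a rough tool for HTML. It is likely to be enough for simple headings like these, but nested markup inside an <h2> would be returned as-is.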

Answer 2

@Micky Balladelli

Here is my PowerShell code:

Clear-Host

# Create the TXT file that will contain the source code
If (Test-Path "$env:USERPROFILE/sc.txt")
{
 Remove-Item "$env:USERPROFILE/sc.txt"
}
New-Item -name "sc.txt" -path "$env:USERPROFILE" -type file

If (Test-Path "$env:USERPROFILE/titles.txt")
{
 Remove-Item "$env:USERPROFILE/titles.txt"
}
New-Item -name "titles.txt" -path "$env:USERPROFILE" -type file

# Create an Internet Explorer com object
$URL        =       "geekweek.pl"
$wc         =       New-Object System.Net.WebClient
$ie         =       New-Object -com InternetExplorer.Application
$ie.visible =       $true
$ie.navigate($URL)
while ($ie.busy)
{
 start-sleep -second 10
}

$doc        =       $ie.Document
$data       =       $wc.DownloadString("http://www.geekweek.pl")

$data >> "$env:USERPROFILE/sc.txt"

$result     =       Get-Content "$env:USERPROFILE/sc.txt" | foreach { $_ -split "<h2>(.*?)</h2>" -join ''}

$result >> "$env:USERPROFILE/titles.txt"

$ie.Quit()

Where:

sc.txt - contains the source code of "geekweek.pl"
titles.txt - should hold the contents of the <h2></h2> tags

In practice, titles.txt ends up containing the complete source code.
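
A likely explanation, sketched below on a hypothetical sample line: when the pattern contains a capture group, -split keeps the captured text in its output, so joining the pieces back together only strips the tags. Lines without an <h2> pass through -split unchanged. Using -match with $Matches[1] instead keeps only the captured title.

```powershell
# Hypothetical sample line; real input would come from sc.txt.
$line = 'before <h2>Title</h2> after'

# -split keeps the captured group in its output, so nothing is discarded:
# the pieces are 'before ', 'Title', ' after'.
$pieces = $line -split '<h2>(.*?)</h2>'

# Joining the pieces back only removes the tags themselves.
$joined = $pieces -join ''

# A line without <h2> is returned unchanged, so it is written out as well.
$other = '<p>no heading here</p>' -split '<h2>(.*?)</h2>' -join ''

# -match with $Matches[1] keeps only the captured text instead.
$title = if ($line -match '<h2>(.*?)</h2>') { $Matches[1] }

$joined
$other
$title
```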

How do I get data from a TXT file?
