Using PowerShell to get data from an HTML file and export it to a text file

Time: 2019-07-10 08:25:25

Tags: html powershell

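I am a PowerShell beginner and need help with a script. I have an HTML report file, and I need to extract the Time: and Web Site: values from it into a text file in which the fields are separated by ;$. The title of each web site (URL) is not in the HTML file, though. Is it possible to fetch it online from the URL, and if so, how?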

Expected content of the text file (output):
Time;$Web Site;$Title (1st URL, first line)
Time;$Web Site;$Title (2nd URL, new line)
Time;$Web Site;$Title (3rd URL, new line)
Time;$Web Site;$Title (4th URL, new line, and so on)

For example:

2019-07-10 10:02:05;$https://www.google.com;$Google

The time format I need is 2019-07-10 10:02:05 (year-month-day followed by the time), not 07/10/2019 10:02:05 as it appears in the HTML file.
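The conversion described above can be sketched as follows. This is a minimal example, assuming the report always writes timestamps as month/day/year with a 24-hour clock, as in the sample rows. The variable names are illustrative. Note that in .NET format strings `MM` means month and `mm` means minutes, so the target pattern is `yyyy-MM-dd HH:mm:ss`.

```powershell
# Sample value copied from the report; the report uses MM/dd/yyyy HH:mm:ss.
$raw = '11/22/2018 07:09:23'

# Use the invariant culture so parsing does not depend on the machine's locale.
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Parse with an explicit format, then re-emit in the requested layout.
$parsed    = [datetime]::ParseExact($raw, 'MM/dd/yyyy HH:mm:ss', $culture)
$formatted = $parsed.ToString('yyyy-MM-dd HH:mm:ss', $culture)

$formatted   # 2018-11-22 07:09:23
```

Parsing with an explicit format avoids the ambiguity of `Get-Date` or a `[datetime]` cast, which can interpret `07/10/2019` differently depending on the system's regional settings.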

Thank you.

The structure of the HTML file is:

<html>
<head><title>Report</title>
    <style>
        table{margin:1 auto}td{padding-left:45;padding-right:20;vertical-align:top}th{padding-left:10;background-color:efefef;border:#aaaaaa 1px solid;border-left:0;border-right:0}.TC{border:#aaaaaa 1px solid;border-top:0;table-layout:fixed;word-wrap:break-word;width:90%;text-align:left}
    </style>
</head>
<body style='text-align:center' vLink=#0066ff aLink=#0066ff link=#0066ff>
<div align=center><font size=4 color=#3399ff>Report</font>
    <hr size=1 color=#99ccff>
</div>
<p></p>
<table cellspacing=0 cellpadding=5 border=1 bordercolor=aaaaaa style='border-collapse:collapse;width:70%'>
    <colgroup>
        <col width=30%/>
        <col width=70%/>
    </colgroup>
    <tr>
        <td><b>Report Date:</b></td>
        <td>22.11.2018 7:23:27</td>
    </tr>
    <tr>
        <td><b>Report Type:</b></td>
        <td>Report of web pages</td>
    </tr>
    <tr>
        <td colspan=2 align='right'><a href='javascript:window.print()'><b>Print This Report</b></a></td>
    </tr>
</table>
<p></p>
<table id=tblLog cellspacing=0 class=TC>
    <colgroup>
        <col width=20%/>
        <col width=80%/>
    </colgroup>
    <tr>
        <th colspan=2><b>&nbsp;</b></th>
    </tr>
    <tr>
        <td><b>Time:</b></td>
        <td>11/22/2018 07:09:23</td>
    </tr>
    <tr>
        <td><b>User:</b></td>
        <td>Teddy</td>
    </tr>
    <tr>
        <td><b>Web Site:</b></td>
        <td><a href=' google.com '> google.com </a></td>
    </tr>
    <tr>
        <td></td>
        <td align=right><b><a href=#top>Top</a></b></td>
    </tr>
    <tr>
        <th colspan=2><b>&nbsp;</b></th>
    </tr>
    <tr>
        <td><b>Time:</b></td>
        <td>11/22/2018 07:09:24</td>
    </tr>
    <tr>
        <td><b>User:</b></td>
        <td>Teddy</td>
    </tr>
    <tr>
        <td><b>Web Site:</b></td>
        <td><a href=' https://www.yahoo.com '> https://www.yahoo.com </a></td>
    </tr>
    <tr>
        <td></td>
        <td align=right><b><a href=#top>Top</a></b></td>
    </tr>
    <tr>
        <th colspan=2><b>&nbsp;</b></th>
    </tr>
    <tr>
        <td><b>Time:</b></td>
        <td>11/22/2018 07:13:23</td>
    </tr>
    <tr>
        <td><b>User:</b></td>
        <td>Teddy</td>
    </tr>
    <tr>
        <td><b>Web Site:</b></td>
        <td><a href=' https://stackoverflow.com '> https://stackoverflow.com </a></td>
    </tr>
    <tr>
        <td></td>
        <td align=right><b><a href=#top>Top</a></b></td>
    </tr>
    <tr>
        <th colspan=2><b>&nbsp;</b></th>
    </tr>
    <tr>
        <td><b>Time:</b></td>
        <td>11/22/2018 07:13:24</td>
    </tr>
    <tr>
        <td><b>User:</b></td>
        <td>Teddy</td>
    </tr>
    <tr>
        <td><b>Web Site:</b></td>
        <td><a href=' https://www.dooms.eu/?redirected=1542870767 '> https://www.dooms.eu/?redirected=1542870767 </a>
        </td>
    </tr>
    <tr>
        <td></td>
        <td align=right><b><a href=#top>Top</a></b></td>
    </tr>
</table>
<p align=center><font size=2>© 2004 - 2018</font></p></body>
</html>
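
One possible approach to the task, given the report structure shown above, is sketched here. It is a rough sketch under several assumptions, not a definitive solution: the file names `report.html` and `output.txt` and the function names `Get-ReportEntry` and `Get-PageTitle` are hypothetical, the regular expression relies on every `Time:` row being followed by a `Web Site:` row exactly as in the sample, and entries without a scheme (such as `google.com`) are assumed to be reachable over `http://`. Titles are fetched online with `Invoke-WebRequest`, so that step needs network access and may return an empty title for pages that block scripted requests.

```powershell
# Hypothetical file names; adjust to the real locations.
$reportPath = '.\report.html'
$outputPath = '.\output.txt'

function Get-ReportEntry {
    # Extracts each Time / Web Site pair from the report HTML.
    param([string]$Html)
    $pattern = '(?s)<b>Time:</b></td>\s*<td>(?<time>[^<]+)</td>.*?' +
               '<b>Web Site:</b></td>\s*<td>\s*<a[^>]*>(?<url>[^<]+)</a>'
    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    foreach ($m in [regex]::Matches($Html, $pattern)) {
        # The report stores times as MM/dd/yyyy HH:mm:ss.
        $time = [datetime]::ParseExact($m.Groups['time'].Value.Trim(),
                                       'MM/dd/yyyy HH:mm:ss', $culture)
        [pscustomobject]@{
            Time = $time.ToString('yyyy-MM-dd HH:mm:ss', $culture)
            Url  = $m.Groups['url'].Value.Trim()
        }
    }
}

function Get-PageTitle {
    # Downloads the page and returns the text of its <title>, or '' on failure.
    param([string]$Url)
    # Assumption: entries without a scheme are tried over http://.
    if ($Url -notmatch '^https?://') { $Url = "http://$Url" }
    try {
        $response = Invoke-WebRequest -Uri $Url -UseBasicParsing -TimeoutSec 15 -ErrorAction Stop
        if ($response.Content -match '(?is)<title[^>]*>(.*?)</title>') {
            return [System.Net.WebUtility]::HtmlDecode($Matches[1]).Trim()
        }
    } catch {
        Write-Warning "Could not fetch title for ${Url}: $_"
    }
    return ''
}

if (Test-Path -Path $reportPath) {
    $html = Get-Content -Path $reportPath -Raw
    Get-ReportEntry -Html $html | ForEach-Object {
        # Single quotes keep ';$' literal instead of starting a variable.
        ($_.Time, $_.Url, (Get-PageTitle -Url $_.Url)) -join ';$'
    } | Set-Content -Path $outputPath -Encoding UTF8
}
```

For the first entry in the sample report, this would write a line beginning with `2018-11-22 07:09:23;$google.com;$`, followed by whatever title the site returns at the time of the request. A regular expression is used here only because the report layout is fixed and simple; a real HTML parser would be more robust if the layout varies.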

0 answers:

No answers