Getting data/text from a website into an HTA

Date: 2017-10-01 18:45:52

Tags: javascript vbscript web-scraping hta

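I am writing an HTA in which I want to look up company data (company name, entity type, etc.) on an external website, based on the company registration number.

Because I am programming in an HTA, I am looking for a solution that is supported there. I have tried various scripts in JavaScript, jQuery, and VBScript, but none of them seems to work in the HTA (some work in JSFiddle but not in the HTA).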

I have the following URL: https://datacvr.virk.dk/data/visenhed?enhedstype=virksomhed&id=24256790 (note the 8-digit code, which is the registration number).
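Since only the registration number changes from one lookup to the next, the address could be assembled from it. The following is a minimal sketch, not tested against the site itself. The helper name `buildCvrUrl` is hypothetical, and it assumes that non-digit characters and extra leading zeros in front of the 8 digits can simply be dropped:

```javascript
// Hypothetical helper: builds the lookup URL for one registration number.
// Assumption: the number is 8 digits, optionally typed with extra leading
// zeros (for example "0024256790") or with spaces in it.
function buildCvrUrl(cvr) {
    // Keep digits only.
    var digits = String(cvr).replace(/\D/g, "");
    // Drop leading zeros, but only those in front of the final 8 digits.
    digits = digits.replace(/^0+(?=\d{8}$)/, "");
    if (!/^\d{8}$/.test(digits)) {
        throw new Error("Expected an 8-digit registration number, got: " + cvr);
    }
    return "https://datacvr.virk.dk/data/visenhed?enhedstype=virksomhed&id=" + digits;
}
```

For example, `buildCvrUrl("24256790")` and `buildCvrUrl("0024256790")` both produce the URL shown above.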

I would like to retrieve the following text:

Novo Nordisk A/S
Virksomhedsform: Aktieselskab

I hope someone knows how to fetch the requested data.
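To make the goal concrete: once the page text is available as a string, the two values could be picked out by label. This is only a sketch under assumptions. The helper name `extractField` is hypothetical, and it assumes the scraped text puts each value either on the same line as its label (after an optional colon) or on the next non-empty line; the real page layout may differ.

```javascript
// Hypothetical helper: returns the value that follows `label` in a block of
// text, or null if the label is not found.
// Assumption: the value is on the label's line or on the next non-empty line.
function extractField(text, label) {
    var lines = String(text).split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i].replace(/^\s+|\s+$/g, "");
        if (line.indexOf(label) !== 0) { continue; }
        var tail = line.slice(label.length);
        // Skip lines where the label is only a prefix of a longer word.
        if (tail && !/^[\s:]/.test(tail)) { continue; }
        var rest = tail.replace(/^[\s:]+/, "");
        if (rest) { return rest; }
        // Value not on the same line: take the next non-empty line.
        for (var j = i + 1; j < lines.length; j++) {
            var next = lines[j].replace(/^\s+|\s+$/g, "");
            if (next) { return next; }
        }
    }
    return null;
}
```

With the sample text `"Novo Nordisk A/S\nVirksomhedsform: Aktieselskab"`, calling `extractField(sample, "Virksomhedsform")` returns `"Aktieselskab"`.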

Update #2: Here is my full HTA code:

<html>
<HTA:APPLICATION ID="Company Data" APPLICATIONNAME="Company Data" BORDER="thick" CAPTION="yes" ICON=images\icon.ico MAXIMIZEBUTTON="yes" MINIMIZEBUTTON="yes" SHOWINTASKBAR="yes" SINGLEINSTANCE="no" SYSMENU="yes" RESIZE="yes" contextMenu=no></HTA:APPLICATION> 
<head>
<title>Regnskabskommetar</title>
<link href="include/stylesheet.css" rel="stylesheet" type="text/css" />
<link rel="SHORTCUT ICON" href="images/icon.ico"/>
<script type="text/javascript" charset="utf-8" src="include/jquery-1.7.min.js"></script>
<script type="text/javascript" charset="utf-8" src="include/underscore-min.js"></script>
<script type="text/javascript" charset="utf-8" src="include/autoNumeric-1.9.18.js"></script>
<script type="text/javascript" charset="utf-8" src="include/addFormat.js"></script>

<script>
function init()
{
// Put the cursor in the CVR input field when the page loads.
document.getElementById("cvr_nr").focus();
}
</script>

<script language="vbscript">
Set fso = CreateObject("Scripting.FileSystemObject")
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = false
ie.Navigate("https://datacvr.virk.dk/data/visenhed?enhedstype=virksomhed&id=24256790")

Dim dteWait 
dteWait = DateAdd("s", 1, Now())
Do Until (Now() > dteWait)
Loop

Set Table = ie.document.getElementsByClassName("table stamdata")
For x = 0 to (Table.length)-1
    Data = Data & Table(x).innerText
Next
ie.Quit()
MyFile = "DataLog.txt"
If fso.FileExists(MyFile) Then 
    fso.DeleteFile(MyFile)
End If

WriteTextFile Data, MyFile, -1
set ws = createObject("wscript.shell")
ws.run MyFile

Sub WriteTextFile(sContent, sPath, lFormat)
        ' lFormat -2 - System default, -1 - Unicode, 0 - ASCII
        With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 8, True, lFormat)
            .WriteLine sContent
            .Close
        End With
End Sub
</script>

<script type="text/javascript">
function reloadpage() {
    location.reload();
}
</script>

<script language="vbscript">
resizeto (screen.width)/2,(screen.height - 40) ' 40 is the height of the taskbar
moveto (screen.width)/2,0
</script>

</head>
<body onLoad="init()" language="vbscript">

<table width="100%" border="0" cellpadding="0" cellspacing="0" style='margin-bottom: 5px;' id="sticky_navigation">
<tr>
<td height="40" id="top_bar" style="padding-left: 10px;">Company Data</td>
<td height="40" id="top_bar" align="right"><a href="include/Help.pdf" class="help">Help</a></td>
<td width="10" id="top_bar" align="right" style="padding-right: 10px;"><a href="#" tabindex="-1" onClick="reloadpage()"><img src="images/footer-logo.png" border="0" title="Opdatér" /></a></td>
</tr>
</table>

<table border="0" width="98%" cellpadding="0" cellspacing="0" style="margin-left: 10px">
<tr>
<td width="40%">
<table border="0" width="100%" cellpadding="0" cellspacing="0">
<tr>
<td width="110">
<b>CVR.</b><br>
<input name="cvr_nr" id="cvr_nr" title="CVR - Kan angives med og uden 00 foran" onchange="" style="text-align: left" size="12" type="number" required>
</td>
<td valign="top">
<b>Virksomhedsnavn</b><br>
<input style="text-align: left" value="Novo Nordisk A/S" size="50">
</td>

<td valign="top">
<b>Virksomhedsform</b><br>
<input style="text-align: left" value="Aktieselskab" size="22" disabled>
</td>
</tr>
</table>

<div id="include_facility" class="switchcontent1"></div>

<br>
<table border="0" width="100%" cellpadding="0" cellspacing="0">
<tr>
<td style="padding-top:5px">
<hr>
<button id="Scraper" onclick="Scraper()" name="Scraper" tabindex="1">Get company</button>
<hr>
</td>
</tr>
</table>
<br>
<div id="content1"></div>

</body>
</html>

1 answer:

Answer 0 (score: 2)

Since you have not provided any code, try this VBScript:

Set fso = CreateObject("Scripting.FileSystemObject")
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = false
ie.Navigate("https://datacvr.virk.dk/data/visenhed?enhedstype=virksomhed&id=24256790")
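  ' Wait until the page has finished loading (ReadyState 4 = complete).
  ' Note: WScript.Sleep exists only under Windows Script Host (wscript.exe/cscript.exe), not inside an HTA.
  Do While ie.Busy Or ie.ReadyState <> 4
     WScript.Sleep 50
  Loop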

Set Table = ie.document.getElementsByClassName("table stamdata")
For x = 0 to (Table.length)-1
    Data = Data & Table(x).innerText
Next
ie.Quit()
MyFile = "DataLog.txt"
If fso.FileExists(MyFile) Then 
    fso.DeleteFile(MyFile)
End If

WriteTextFile Data, MyFile, -1
set ws = createObject("wscript.shell")
ws.run MyFile

Sub WriteTextFile(sContent, sPath, lFormat)
        ' lFormat -2 - System default, -1 - Unicode, 0 - ASCII
        With CreateObject("Scripting.FileSystemObject").OpenTextFile(sPath, 8, True, lFormat)
            .WriteLine sContent
            .Close
        End With
End Sub
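
The script above relies on `WScript.Sleep`, which is available when it runs as a standalone .vbs file under Windows Script Host; inside an HTA the `WScript` object does not exist. One possible substitute there is to poll on a timer instead of sleeping. The following is a minimal sketch in JavaScript, not a drop-in replacement. All names (`pollUntil`, `isReady`, `onReady`, `onTimeout`) are hypothetical, and the optional `schedule` parameter exists only so the helper can be exercised without real timers:

```javascript
// Hypothetical helper: calls isReady() every intervalMs milliseconds until it
// returns true (then calls onReady) or until maxTries checks have failed
// (then calls onTimeout). Nothing blocks while waiting.
function pollUntil(isReady, onReady, onTimeout, intervalMs, maxTries, schedule) {
    // Default to the timer that browsers and HTAs provide.
    schedule = schedule || function (fn, ms) { return setTimeout(fn, ms); };
    var tries = 0;
    function check() {
        if (isReady()) {
            onReady();
            return;
        }
        tries += 1;
        if (tries >= maxTries) {
            onTimeout();
            return;
        }
        schedule(check, intervalMs);
    }
    check();
}
```

In the HTA, `isReady` could for instance test whether `ie.ReadyState` equals 4 on the `InternetExplorer.Application` object, and `onReady` could then read the table text; that wiring is an assumption and has not been tested against the site.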