Getting data from the HTML DOM into Microsoft Word

Date: 2015-10-27 10:00:23

Tags: html dom ms-word word-vba

I have been trying to pull data from Google Patents into Microsoft Word. I am using the following VBA to extract the information:

Function Get_URL_Data(StrUrl As String) As String
    'Requires references to Microsoft Internet Controls (SHDocVw)
    'and the Microsoft HTML Object Library (MSHTML)
    Dim Browser As SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim Claims As Object
    Set Browser = New SHDocVw.InternetExplorer
    'Open the web page and wait until it has finished loading
    Browser.navigate StrUrl
    Application.StatusBar = "Loading " & StrUrl
    Do While Browser.Busy Or Browser.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    'Take the document only after the page has loaded completely
    Set HTMLDoc = Browser.document
    'Get the data. The collection is zero-based, so index 1 is the second match.
    'Read innerText, because the element object itself is not a string.
    Set Claims = HTMLDoc.getElementsByClassName("claim-text")
    If Claims.Length > 1 Then Get_URL_Data = Claims.Item(1).innerText
    'Close the browser
    Browser.Quit
    Set HTMLDoc = Nothing: Set Browser = Nothing
    Application.ScreenUpdating = True
End Function

My HTML contains li class="claim":

    <li class="claim"> <div id="c-en-0001" num="0001" class="claim">
    <div class="claim-text">A process for producing a compound represented by formula (II):
<chemistry id="chem0047" num="0047"> <div class="patent-image"> <a href="//patentimages.storage.googleapis.com/EP1925611A1/imgb0047.png"> <img id="ib0047" file="imgb0047.tif" wi="40" he="40" img-content="chem" img-format="tif" src="//patentimages.storage.googleapis.com/EP1925611A1/imgb0047.png" class="patent-full-image" width="160" height="160" alt="Figure imgb0047"> </a> </div> <attachments> <attachment idref="chem0047" attachment-type="cdx" file="CDX"> </attachment> <attachment idref="chem0047" attachment-type="mol" file="MOL"> </attachment> </attachments> </chemistry>
(wherein Y represents -COR, wherein R represents a C1-C8 alkoxy group, a C6-C14 aryloxy group, a C2-C8 alkenyloxy group, a C7-C26 aralkyloxy group, or a di(C1-C6 alkyl)amino group; and R<sup>1</sup> represents a C2-C7 alkoxycarbonyl group), which comprises treating a compound represented by formula (I):
<chemistry id="chem0048" num="0048"> <div class="patent-image"> <a href="//patentimages.storage.googleapis.com/EP1925611A1/imgb0048.png"> <img id="ib0048" file="imgb0048.tif" wi="26" he="34" img-content="chem" img-format="tif" src="//patentimages.storage.googleapis.com/EP1925611A1/imgb0048.png" class="patent-full-image" width="104" height="136" alt="Figure imgb0048"> </a> </div> <attachments> <attachment idref="chem0048" attachment-type="cdx" file="CDX"> </attachment> <attachment idref="chem0048" attachment-type="mol" file="MOL"> </attachment> </attachments> </chemistry>
(wherein Y has the same meaning as defined above) in a solvent with aqueous ammonia or a solution of ammonia in C1-C4 alcohol and, subsequently, with a di(C1-C6 alkyl) dicarbonate.</div>
  </div>

I am trying to figure out how to scrape this text, but I cannot work out which HTML DOM method to use or how to apply it.

I have tried several different approaches, but none of them gave me a result. I tried scraping with:

StrTmp = HTMLDoc.getElementsByClassName("claim-text")
StrTmp = HTMLDoc.getElementById("claim-text")
StrTmp = HTMLDoc.getElementsByName("claim-text")
StrTmp = HTMLDoc.document.getElementsByTagName("claim-text")

I also tried scraping with:

StrTmp = HTMLDoc.getElementsByClassName("claim")
StrTmp = HTMLDoc.getElementById("claim")
StrTmp = HTMLDoc.getElementsByName("claim")
StrTmp = HTMLDoc.document.getElementsByTagName("claim")

I still have not gotten any results. Am I on the right track?
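
For reference, here is a minimal sketch of how the attempts above differ, based only on the sample HTML shown. In that markup "claim-text" and "claim" are class names, so getElementById (which matches the id attribute, e.g. "c-en-0001"), getElementsByName (which matches the name attribute) and getElementsByTagName (which matches tag names such as "div") would not be expected to find them. getElementsByClassName returns a collection rather than a string, so one item has to be picked and its innerText read. The helper name ExtractClaimText and its Index parameter are hypothetical, and the sketch assumes the page has finished loading and that Internet Explorer renders it in a document mode that supports getElementsByClassName (IE 9 standards mode or later).

```vba
' Sketch only: assumes HTMLDoc is a fully loaded MSHTML.HTMLDocument.
' ExtractClaimText is a hypothetical helper, not part of the original code.
Function ExtractClaimText(HTMLDoc As MSHTML.HTMLDocument, Index As Long) As String
    Dim Claims As Object
    ' getElementsByClassName returns a collection of elements, not a string
    Set Claims = HTMLDoc.getElementsByClassName("claim-text")
    ' The collection is zero-based; return an empty string if Index is out of range
    If Index >= 0 And Index < Claims.Length Then
        ' innerText gives the visible text of the element without the markup
        ExtractClaimText = Claims.Item(Index).innerText
    End If
End Function
```

Under these assumptions, calling ExtractClaimText(HTMLDoc, 0) from inside Get_URL_Data would return the text of the first "claim-text" element, or an empty string if the page contains none.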

0 answers:

No answers