PreviousSibling not working with querySelector

Date: 2019-07-14 08:02:28

Tags: html excel vba

I am trying to extract two parts from a local HTML file located at C:\Sample.html. I used @QHarr's code from another, similar thread:

Sub Test()
Dim html As HTMLDocument, post As Object, i As Long

Set html = GetHTMLFileContent("C:\Sample.html")
Set post = html.querySelectorAll("span.course-player__chapter-item__completion")

For i = 0 To post.Length - 1
    ActiveSheet.Cells(i + 1, 1) = Trim(post.item(i).innerText)
    ActiveSheet.Cells(i + 1, 2) = post.item(i).PreviousSibling.innerText
Next i
End Sub

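' Requires a reference to "Microsoft HTML Object Library" (Tools > References).
Function GetHTMLFileContent(ByVal filePath As String) As HTMLDocument
    Dim fso As Object, hFile As Object, hString As String, html As HTMLDocument

    Set html = New HTMLDocument
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set hFile = fso.OpenTextFile(filePath)

    ' ReadAll raises an error on an empty file, so check for end of stream first.
    If Not hFile.AtEndOfStream Then hString = hFile.ReadAll()
    hFile.Close

    ' Load the file's markup into the in-memory document.
    html.body.innerHTML = hString
    Set GetHTMLFileContent = html
End Function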

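The code works and retrieves the inner text of the matched elements via post.item(i).innerText. However, when I try to get the inner text of the previous sibling, it returns nothing.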

Here is a snapshot of the HTML:

<div class="course-player__chapter-item__header _chapter-item__header_d57kmg ui-accordion-header ui-corner-top ui-state-default ui-accordion-icons ui-accordion-header-active ui-state-active" role="tab" id="ui-id-1" aria-controls="ui-id-2" aria-selected="true" aria-expanded="true" tabindex="0"><span class="ui-accordion-header-icon ui-icon ui-icon-triangle-1-s"></span>
  <h2 tabindex="-1" class="course-player__chapter-item__title _chapter-item__title_d57kmg">
    <span class="course-player__progress _chapter-item__progress_d57kmg">
      <span data-percentage-completion="100" class="_chapter-item__progress-ring_d57kmg">
        <span class="progress-ring__ring _progress-ring__ring_jgsecr">
	<span class="progress-ring__mask progress-ring--full _progress-ring__mask_jgsecr _progress-ring--full_jgsecr">
		<span class="progress-ring--fill brand-color__background _progress-ring--fill_jgsecr"></span>
	</span>
	<span class="progress-ring__mask progress-ring--half _progress-ring__mask_jgsecr ">
		<span class="progress-ring--fill brand-color__background _progress-ring--fill_jgsecr"></span>
		<span class="progress-ring--fill progress-ring--fix _progress-ring--fill_jgsecr _progress-ring--fix_jgsecr"></span>
	</span>
</span>
<span class="progress-ring__ring-inset _progress-ring__ring-inset_jgsecr"></span>
<span class="progress-ring__checkmark brand-color__text _progress-ring__checkmark_jgsecr"><i aria-label="Completed" class="toga-icon toga-icon-checkmark"></i></span>

      </span>
    </span>

    INTRO TO VBA - Overview

<!---->
    <span class="course-player__chapter-item__completion _chapter-item__completion_d57kmg">
      10 / 10
    </span>

    <span class="course-player__chapter-item__toggle _chapter-item__toggle_d57kmg">
      <i aria-hidden="true" class="chapter-item__toggle-icon toga-icon toga-icon-caret-stroke-down _chapter-item__toggle-icon_d57kmg"></i>
    </span>

  </h2>
</div>

1 answer:

Answer 0 (score: 0)

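I used the CSS selector h2[class='course-player__chapter-item__title _chapter-item__title_d57kmg'], which returns the full text of each heading (title and completion count together), and then split the output into two columns. The last three space-separated tokens go to column 2 and everything before them goes to column 1: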

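Sub Test()
    Dim x As Variant, html As HTMLDocument, post As Object, s As String, i As Long

    Set html = GetHTMLFileContent("C:\Sample.html")
    Set post = html.querySelectorAll("h2[class='course-player__chapter-item__title _chapter-item__title_d57kmg']")

    For i = 0 To post.Length - 1
        ' Split the heading text on spaces, e.g. "INTRO TO VBA - Overview 10 / 10".
        x = Split(Trim(post.item(i).innerText), " ")

        ' The last three tokens form the completion count. Join them in their
        ' original order so that a value such as "3 / 10" is not reversed.
        s = Join(Array(x(UBound(x) - 2), x(UBound(x) - 1), x(UBound(x))), " ")

        ' Drop those three tokens so that only the title remains.
        ' This assumes the heading always has at least four tokens.
        ReDim Preserve x(0 To UBound(x) - 3)

        ActiveSheet.Cells(i + 1, 1) = Trim(Join(x, " "))
        ActiveSheet.Cells(i + 1, 2) = Trim(s)
    Next i
End Sub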
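
A possible alternative that stays closer to the original PreviousSibling idea: in the posted HTML the title is a bare text node inside the h2, and an empty comment (<!---->) sits between it and the completion span, so the node immediately before the span is most likely not the title text. Text nodes also expose nodeValue rather than innerText. The sketch below walks backwards through the siblings until it finds a non-blank text node. PrecedingText and CleanText are hypothetical helper names, not part of the original code, and I have not tested this against the real file.

```vba
' Sketch only: hypothetical helpers, assuming late-bound MSHTML nodes.

' Replaces line breaks and tabs with spaces, then trims,
' because Trim$ on its own only removes leading/trailing spaces.
Private Function CleanText(ByVal s As String) As String
    s = Replace(s, vbCr, " ")
    s = Replace(s, vbLf, " ")
    s = Replace(s, vbTab, " ")
    CleanText = Trim$(s)
End Function

' Returns the text of the nearest preceding sibling that is a
' non-blank text node, or an empty string if there is none.
Function PrecedingText(ByVal node As Object) As String
    Dim sib As Object, txt As String

    Set sib = node.previousSibling
    Do While Not sib Is Nothing
        If sib.nodeType = 3 Then            ' 3 = text node
            txt = CleanText(sib.nodeValue)
            If Len(txt) > 0 Then
                PrecedingText = txt
                Exit Function
            End If
        End If
        ' Skip comments, elements and whitespace-only text nodes.
        Set sib = sib.previousSibling
    Loop
End Function
```

In the loop from the question, the second column would then be filled with ActiveSheet.Cells(i + 1, 2) = PrecedingText(post.item(i)) instead of post.item(i).PreviousSibling.innerText. Whether whitespace-only text nodes appear among the siblings can depend on the document mode MSHTML uses, which is why the loop skips them rather than assuming a fixed position.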