VBA - Getting table data from content rendered by JavaScript

Date: 2019-03-04 04:11:22

Tags: html excel vba web-scraping

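I'm having trouble reading the content of a td element. The data appears to be rendered by JavaScript, and I need to extract the value for Pictures.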

Table data:

I have already tried a basic scraping approach, but it could not collect the data.

VBA code:

post = Trim(doc.getElementsByClassName("chart-content-container")(5).getElementsByTagName("table")(0).getElementsByTagName("tbody")(0).getElementsByTagName("tr")(1).innerText)

The data I want comes from here (you need to log in to see it, basically via Facebook login): fanpagedata

HTML:

<div id="id1355">
<span class="chart-container" id="id12c4">
<div class="chart-title">Types of posts</div>
<div class="chart-content-container">
<div class="chart-content-sameInnerHeight" style="margin-top: 50px;" id="id1302">
<div id="id1355">
<div style="position: relative;">
<div id="dashboard9566356">
<div style="position: relative;" id="chartDiv9566356"><div style="position: relative;"><div dir="ltr" style="position: relative; width: 563px; height: 200px;"><div style="position: absolute; left: 0px; top: 0px; width: 100%; height: 100%;" aria-label="A chart."><svg width="563" height="200" aria-label="A chart." style="overflow: hidden;"><defs id="defs"></defs><g><path d="M282,23.5L282,10A90,90,0,0,1,341.68103924167156,32.63403266460091L332.72888335542086,42.73892776491077A76.5,76.5,0,0,0,282,23.5" stroke="#ffffff" stroke-width="1" fill="#c2185b"></path></g><g><path d="M263.6923516820019,25.722950966907007L260.4615902141199,12.615236431655305A90,90,0,0,1,282,10L282,23.5A76.5,76.5,0,0,0,263.6923516820019,25.722950966907007" stroke="#ffffff" stroke-width="1" fill="#f57c00"></path></g><g><path d="M206.0577711314989,109.22105603953229L192.65620133117517,110.84830122297916A90,90,0,0,1,260.4615902141197,12.615236431655333L263.69235168200174,25.722950966907035A76.5,76.5,0,0,0,206.0577711314989,109.22105603953229" stroke="#ffffff" stroke-width="1" fill="#8ecbd3"></path></g><g><path d="M332.72888335542086,42.73892776491077L341.68103924167156,32.63403266460091A90,90,0,1,1,192.6562013311751,110.84830122297903L206.05777113149887,109.22105603953219A76.5,76.5,0,1,0,332.72888335542086,42.73892776491077" stroke="#ffffff" stroke-width="1" fill="#5c6fda"></path></g><g><g><g><text text-anchor="end" x="535" y="20.491718170580963" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#6d90cb">Links</text></g><g><text text-anchor="end" x="535" y="39.90828182941904" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#9e9e9e">11.5%</text></g></g><g><path d="M319.5,26.5L384,26.5L384,26.5L535.5,26.5" stroke="#636363" stroke-width="1" stroke-opacity="0.7" fill-opacity="1" fill="none"></path><circle cx="319.5" cy="26.5" r="2" stroke="none" stroke-width="0" fill-opacity="0.7" fill="#636363"></circle></g><g><g><text text-anchor="end" x="535" 
y="168.49171817058095" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#6d90cb">Pictures</text></g><g><text text-anchor="end" x="535" y="187.90828182941902" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#9e9e9e">61.5%</text></g></g><g><path d="M321.5,174.5L384,174.5L384,174.5L535.5,174.5" stroke="#636363" stroke-width="1" stroke-opacity="0.7" fill-opacity="1" fill="none"></path><circle cx="321.5" cy="174.5" r="2" stroke="none" stroke-width="0" fill-opacity="0.7" fill="#636363"></circle></g><g><g><text text-anchor="start" x="28" y="54.49171817058097" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#6d90cb">Videos</text></g><g><text text-anchor="start" x="28" y="73.90828182941904" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#9e9e9e">23.1%</text></g></g><g><path d="M209.5,60.5L180,60.5L180,60.5L28.5,60.5" stroke="#636363" stroke-width="1" stroke-opacity="0.7" fill-opacity="1" fill="none"></path><circle cx="209.5" cy="60.5" r="2" stroke="none" stroke-width="0" fill-opacity="0.7" fill="#636363"></circle></g><g><g><text text-anchor="start" x="28" y="19.491718170580963" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#6d90cb">Status</text></g><g><text text-anchor="start" x="28" y="38.90828182941904" font-family="Arial" font-size="12" stroke="none" stroke-width="0" fill="#9e9e9e">3.8%</text></g></g><g><path d="M268.5,18.5L180,18.5L180,25.5L28.5,25.5" stroke="#636363" stroke-width="1" stroke-opacity="0.7" fill-opacity="1" fill="none"></path><circle cx="268.5" cy="18.5" r="2" stroke="none" stroke-width="0" fill-opacity="0.7" fill="#636363"></circle></g></g><g></g></svg><div aria-label="A tabular representation of the data in the chart." 
style="position: absolute; left: -10000px; top: auto; width: 1px; height: 1px; overflow: hidden;"><table><thead><tr><th>Status</th><th>Wert</th></tr></thead><tbody><tr><td>Links</td><td>3</td></tr><tr><td>Pictures</td><td>16</td></tr><tr><td>Videos</td><td>6</td></tr><tr><td>Status</td><td>1</td></tr></tbody></table></div></div></div><div aria-hidden="true" style="display: none; position: absolute; top: 210px; left: 573px; white-space: nowrap; font-family: Arial; font-size: 12px;">3.8%</div><div></div></div></div>
<div id="control9566356"></div>
</div>
<div id="id13ad" style="display:none"></div>
</div>
<script type="text/javascript">    var chart9566356;	var data9566356;var options9566356;
var vAxisFormat9566356 = 'decimal';function drawVisualization9566356() {options9566356 = {curveType: 'none',  fontName: 'Arial',	 'lineWidth':2,	 'height':200,		backgroundColor: {  fill: 'transparent'  },'legend': {'position': 'labeled', 'alignment': 'center' , textStyle: {color: '#6d90cb'}},		chartArea: {   },tooltip :{ isHtml : 'true'},'interpolateNulls' : true, pieHole: 0.85, pieSliceText: 'none', chartArea: { width: '90%', height: '90%', left: '5%', top: '5%' }, series: {0: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0},1: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0},2: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0},3: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0}    },    vAxes: {      0: {format: 'decimal'},      1: {format: 'decimal'}    },colors: ['#c2185b','#5c6fda','#8ecbd3','#f57c00'],animation:{   duration: 500,  easing: 'out', },hAxis:{		textStyle: {  },gridlines: {color: 'transparent'},'baselineColor': '#ccc'},vAxis: {format: vAxisFormat9566356,		textStyle: {  },gridlines: {color: 'transparent'},'baselineColor': '#ccc'}    };
chart9566356 = new google.visualization.PieChart(document.getElementById('chartDiv9566356'));
data9566356 = new google.visualization.DataTable();data9566356.addColumn('string', 'Status');    data9566356.addColumn('number', 'Wert');data9566356.addColumn({'type': 'string', 'role': 'tooltip', 'p': {'html': true}});	data9566356.addColumn({type:'string', role:'annotationText'});	data9566356.addColumn({type:'string', role:'annotationText'});	data9566356.addColumn({type:'string', role:'annotationText'}); data9566356.addRows([['Links',3.0,'<div class="name">Links</div><div class="value">3</div>',null,'0','link'],['Pictures',16.0,'<div class="name">Pictures</div><div class="value">16</div>',null,'0','photo'],['Videos',6.0,'<div class="name">Videos</div><div class="value">6</div>',null,'0','video'],['Status',1.0,'<div class="name">Status</div><div class="value">1</div>',null,'0','status']]);
var numberFormatter = new google.visualization.NumberFormat({ fractionDigits: 0 }); for (var i = 1; i < data9566356.getNumberOfColumns(); i++) {		if(data9566356.getColumnType(i) === "number"){			numberFormatter.format(data9566356, i);		}}
var selectListener9566356 = google.visualization.events.addListener(chart9566356, 'select',function(){	  var selection = chart9566356.getSelection();	  for (var i = 0; i < selection.length; i++) {	     var item = selection[i];	     if (item.row != null && item.column != null) {var idsColumnNumber = item.column +2;var id = data9566356.getValue(item.row, idsColumnNumber);if(id != null) { id = id.replace('#', ''); }	Wicket.Ajax.get({ u:  './ClashofClans?38-1.IBehaviorListener.0-fanPageKarmaPanel-resultContainer-graphenPanel-graphenTimesAndTypesPanelContainer-graphenTimesAndTypesPanel-content-postTypePanel-postTypePanel'+'&id='+id+'&datum='+data9566356.getValue(item.row, idsColumnNumber+1)+'&anzahl='+data9566356.getValue(item.row, item.column)+'&page_identifier='+data9566356.getValue(item.row, idsColumnNumber+2)+'&spalte='+item.column});	    } else {	       try {var id = data9566356.getValue(item.row, 7);	          Wicket.Ajax.get({ u: './ClashofClans?38-1.IBehaviorListener.0-fanPageKarmaPanel-resultContainer-graphenPanel-graphenTimesAndTypesPanelContainer-graphenTimesAndTypesPanel-content-postTypePanel-postTypePanel'+'&id='+id+'&page_identifier='+id});         } catch(e) {            var id = data9566356.getValue(item.row, 5);	          Wicket.Ajax.get({ u: './ClashofClans?38-1.IBehaviorListener.0-fanPageKarmaPanel-resultContainer-graphenPanel-graphenTimesAndTypesPanelContainer-graphenTimesAndTypesPanel-content-postTypePanel-postTypePanel'+'&id='+id+'&page_identifier='+id});         }     }   } });placeMarker9566356 = function(graph , dataTable) {
};placeMarkerListener9566356 = google.visualization.events.addListener(chart9566356
, 'ready',placeMarker9566356.bind(chart9566356
, chart9566356
, data9566356));
chart9566356.draw(data9566356,options9566356);}function reDrawChart9566356() { if (!$('#chartDiv9566356').is(':visible')) { return; }	google.visualization.events.removeListener(placeMarkerListener9566356);
	placeMarkerListener9566356 = google.visualization.events.addListener(chart9566356
, 'ready', placeMarker9566356.bind(chart9566356
, chart9566356
, data9566356));
options9566356 = {curveType: 'none',  fontName: 'Arial',	 'lineWidth':2,	 'height':200,		backgroundColor: {  fill: 'transparent'  },'legend': {'position': 'labeled', 'alignment': 'center' , textStyle: {color: '#6d90cb'}},		chartArea: {   },tooltip :{ isHtml : 'true'},'interpolateNulls' : true, pieHole: 0.85, pieSliceText: 'none', chartArea: { width: '90%', height: '90%', left: '5%', top: '5%' }, series: {0: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0},1: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0},2: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0},3: {targetAxisIndex: 0,areaOpacity :1.0,type: 'area',dataOpacity :1.0}    },    vAxes: {      0: {format: 'decimal'},      1: {format: 'decimal'}    },colors: ['#c2185b','#5c6fda','#8ecbd3','#f57c00'],animation:{   duration: 500,  easing: 'out', },hAxis:{		textStyle: {  },gridlines: {color: 'transparent'},'baselineColor': '#ccc'},vAxis: {format: vAxisFormat9566356,		textStyle: {  },gridlines: {color: 'transparent'},'baselineColor': '#ccc'}    };
chart9566356.draw(data9566356,options9566356);}$('body').off('redrawAfterResize.chart9566356').on('redrawAfterResize.chart9566356', function() { reDrawChart9566356(); }); if (typeof googleChartApiLoaded !== 'undefined' && googleChartApiLoaded) {	 drawVisualization9566356();} else {	 google.setOnLoadCallback(drawVisualization9566356);}</script>
</div>
</div>
<p class="chart-footer-note">Here you can see the mixture of post types. Find out which types of posts are used most often in the selected time period. Generally speaking, it is good advice to try out all post types and entertain fans with a good mixture.</p>
</div>
</span>

Does anyone know how to scrape this?

1 answer:

Answer 0 (score: 0)

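You can collect all the td elements, loop over them testing for the word Pictures, and then take the node at the next index, which holds the value: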

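Dim list As Object, i As Long
' Collect every td inside the chart containers
Set list = doc.querySelectorAll(".chart-content-container td")
' Stop one short of the end so that list.Item(i + 1) always exists
For i = 0 To list.Length - 2
    If Trim$(list.Item(i).innerText) = "Pictures" Then
        ' The value cell directly follows the label cell
        Debug.Print list.Item(i + 1).innerText
        Exit For
    End If
Next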
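
The hidden table only exists after the chart script has run, so the loop above may find nothing if the HTML was fetched without executing JavaScript. As a possible fallback, the same numbers also appear in the inline script's addRows([...]) call shown in the question. The following is a minimal sketch, not part of the original answer: GetChartValue and DemoGetChartValue are hypothetical helper names, and it assumes the rows keep the ['Pictures',16.0,...] shape and that the label contains no regular-expression metacharacters.

```vba
Option Explicit

' Hypothetical helper: returns the number that follows ['<label>', in a
' Google Charts addRows([...]) call found in the page source, or -1 if
' the label is not found.
' Assumption: rows look like ['Pictures',16.0,...] as in the question's HTML,
' and label is plain text (it is inserted into the pattern unescaped).
Public Function GetChartValue(ByVal pageSource As String, ByVal label As String) As Double
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    ' Match ['<label>', followed by an integer or decimal number
    re.Pattern = "\['" & label & "'\s*,\s*([0-9]+(?:\.[0-9]+)?)"
    re.IgnoreCase = False
    re.Global = False

    GetChartValue = -1
    Set matches = re.Execute(pageSource)
    If matches.Count > 0 Then
        ' Val always treats "." as the decimal separator, regardless of locale
        GetChartValue = Val(matches(0).SubMatches(0))
    End If
End Function

Public Sub DemoGetChartValue()
    Dim sample As String
    ' Shortened fragment modelled on the script block in the question
    sample = "data9566356.addRows([['Links',3.0,'x',null,'0','link']," & _
             "['Pictures',16.0,'x',null,'0','photo']]);"
    Debug.Print GetChartValue(sample, "Pictures")
End Sub
```

For the fragment in DemoGetChartValue the function returns 16, matching the Pictures row in the question. If doc is an MSHTML document as in the code above, passing doc.documentElement.outerHTML as pageSource would be one way to supply the source, provided the script block is present in what was loaded.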