New to Cheerio and JS. I'm trying to write all of the pitcher names and their associated stats into a JSON object, like this:
var pitchers = {
    name: 'Justin Verlander',
    era: 6.62,
    // etc...
    // etc...
}
Here is the HTML I'm trying to scrape:
<tr class="">
<td class="stat-name-width"><img src="../../style/assets/img/mlb/team-logos/tigers.png" height="20"/>
<span class="pitcher-name">Justin Verlander</span>
<div class="fantasy-blue inline fantasy-data pitcher-salary-fd">$7,100</div>
<small class="text-muted pitches">(R)</small>
<small class="text-muted matchup">(@ BOS)</small></td>
<td class="stat-stat-width fantasy-blue fantasy-points"></td>
<td class="stat-stat-width">0-3</td>
<td class="stat-stat-width">6.62</td>
<td class="stat-stat-width">1.50</td>
<td class="stat-stat-width">5.82</td>
<td class="stat-stat-width">3.18</td>
<td class="stat-stat-width">2.12</td>
<td class="stat-stat-width">5.67</td>
<td class="stat-stat-width">1.03x</td>
<td class="stat-stat-width">0.96x</td>
<td class="stat-stat-width">1.09x</td>
<td class="stat-stat-width">0.90x</td>
</tr>
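There are about 30 pitchers on the same page with the same structure.
Here is what I have so far:
test = $('span.pitcher-name').text(); gives me all of the pitcher names, not just one.
Obviously I'm not even close... I can't figure out how to tie the elements that go with each pitcher name to a JavaScript object... Any help is greatly appreciated!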
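Answer 0 (score: 1)
It looks like what you want is the $().each() function.
With this function you can iterate over every instance of a tag and run a callback on each one, like so: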
var someObjArr = [];
$('span.pitcher-name').each(function(i, element){
    //Get the text from cheerio.
    var text = $(this).text();
    //if undefined, create the object inside of our array.
    if(someObjArr[i] == undefined){
        someObjArr[i] = {};
    }
    //Update the name property of our object with the text value.
    someObjArr[i].name = text;
});
$('div.pitcher-salary-fd').each(function(i, element){
    //Get the text from cheerio.
    var text = $(this).text();
    //if undefined, create the object inside of our array.
    if(someObjArr[i] == undefined){
        someObjArr[i] = {};
    }
    //Update the salary property of our object with the text value.
    someObjArr[i].salary = text;
});
console.log(someObjArr); //[ { name: 'Justin Verlander', salary: '$7,100' } ]
One of the best things about this function is that it works synchronously, so it behaves much like a for loop and is easy to follow.
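Note that .text() always returns strings. If numeric fields are wanted, as in the era: 6.62 of the question, the values can be converted after scraping. A minimal sketch; the helper names parseSalary and parseStat are made up for illustration and are not part of cheerio:

```javascript
// Hypothetical helpers (not part of cheerio) for turning scraped text into numbers.

// '$7,100' -> 7100. Strips everything except digits and the decimal point.
function parseSalary(text) {
  var cleaned = text.replace(/[^0-9.]/g, '');
  return cleaned === '' ? null : Number(cleaned);
}

// '6.62' -> 6.62. Values that are not plain numbers ('0-3', '1.03x')
// are returned unchanged as trimmed strings.
function parseStat(text) {
  var trimmed = text.trim();
  return /^-?\d+(\.\d+)?$/.test(trimmed) ? Number(trimmed) : trimmed;
}

var pitcher = {
  name: 'Justin Verlander',
  salary: parseSalary('$7,100'),
  era: parseStat('6.62')
};

console.log(pitcher);
```

Leaving values such as '0-3' and '1.03x' as strings avoids guessing at what those columns mean.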
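Keep in mind that you can print each element through $(this) inside the callback. This is especially useful when you need to work out which specific tag you should be selecting. For example: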
$('span.pitcher-name').each(function(i, element){
    //Return the entire element.
    var pitcherNameElement = $(this);
    //Prints all of the element's properties.
    console.log(pitcherNameElement);
});
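Now, retrieving something more abstract, such as an array of the items in the same table row, gets a little more complicated. To do this, we need to call $().each() on the table rows and then check each child's class for a match. That way, we can keep using the same index.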
$('tr').each(function(i, element){
    //get all children of a table row
    var children = $(this)['0'].children;
    //this array will hold the matchup data
    var matchupArr = [];
    //class to extract
    var statClass = 'stat-stat-width';
    //for loop-ing the children
    for(var myInt=0; myInt<children.length; myInt++){
        //the next element of this child
        var next = children[myInt].next;
        //sometimes next is undefined
        if(next != undefined){
            //get the html attribs of the next element
            var attribs = next.attribs;
            //sometimes the next element has no attribs
            if(attribs != undefined){
                //class of the next element
                var myClass = attribs.class;
                //if the next element's class is the one we want
                if(myClass == statClass){
                    //push it to our matchup array
                    matchupArr.push(next.children[0].data);
                }
            }
        }
    }
    //if undefined, create the object inside of our array.
    if(someObjArr[i] == undefined){
        someObjArr[i] = {};
    }
    //Update the matchup property of our object with our array.
    if(matchupArr.length > 0){
        someObjArr[i].matchups = matchupArr;
    }
});
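This is a bit of a hack, but it shows the underlying concept. A method that lets you run a callback on every child C within a parent P would be a great addition to the library. But, alas, we live in an imperfect world.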
Good luck, and happy scraping!
Answer 1 (score: 0)
Have you looked at the documentation? If you get stuck, it has plenty of examples of how to traverse a site's elements.
For example:
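$('span.pitcher-name').next()
//for the row above, selects the element right after the name:
//<div class="fantasy-blue inline fantasy-data pitcher-salary-fd">$7,100</div>
//the <small class="text-muted pitches">(R)</small> element is one sibling further: .next().next()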