How do I get attributes from a tricky element with Cheerio?

Asked: 2018-02-09 05:54:10

Tags: node.js cheerio

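Hello, I'm trying to extract some information from a web page, and it's a bit tricky. The element I need information from looks like this: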

<div id="1449822" class="match_line score_row other_match e_true "
     data-cntr="0"
     data-parent-competition="A-LEAGUE"
     data-note="Venue: Etihad Stadium. Turf: Natural. Capacity: 56,347. Distance: 1,667km. Sidelined Players: MELBOURNE VICTORY - AUSTIN MITCHELL, DENG THOMAS, NIGRO STEFAN (Injured). BRISBANE ROAR FC - BROWN COREY, DE VERE LUKE, O TOOLE CONNOR, THEO MICHAEL, CALETTI JOE, D AGOSTINO NICHOLAS (Injured)."
     data-competition-name="A-LEAGUE"
     data-league-type="LEAGUE"
     data-season="2017/2018"
     data-statustype="sched"
     data-ko="09:50"
     data-home-team="MELBOURNE VICTORY"
     data-away-team="BRISBANE ROAR FC"
     data-league-sort="11"
     data-correction="0"
     data-matchday="2018-02-09"
     data-game-status="Sched"
     data-league-code="41256"
     data-league-name="A-LEAGUE"
     data-country-name="AUSTRALIA"
     data-league-round="20"
     data-league-short="AL"
     data-home-id="28529"
     data-away-id="28531"
     data-ftr="false">

The attributes I'm particularly interested in are:

 data-season= 
 data-note=
 data-league-name=
 data-country-name=
 data-home-team=
 data-away-team=

But I'm not sure how to get this information. Here is what I have tried:

var http = require('http');
var request = require('request');
var cheerio = require('cheerio');

http.createServer(function (req, res) {
  request('http://www.xscores.com/soccer', function (error, response, html) {
    if (!error && response.statusCode == 200) {
      var $ = cheerio.load(html);
      var list_items = "";

      $('div.match_line.score_row.other_match.e_true').each(function (i, element) {
        var a = $(this).text();
        list_items += "<li>" + a + "</li>";
      });

      var html = "<ul>" + list_items + "</ul>"
      res.writeHead(200, {
        'Content-Type': 'text/html'
      });
      res.end(html);
    }
  });
}).listen(8080);
console.log('Server is running at http://178.62.253.206:8080/');

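However, the code above does not seem to read the attributes of this element itself; instead it collects the text of all the div elements nested beneath it. Here is what my code returns (http://178.62.253.206:8080/):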

10:50 SCH SHOW GAMES FROM AUSTRALIA AL MELBOURNE VICTORY 5 BRISBANE ROAR FC 7 Match Details

Any help with this would be greatly appreciated.

frederik

1 answer:

Answer 0 (score: 0)

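const cheerio = require('cheerio');

// Replace the placeholder with the HTML of the <div> you want to inspect.
const divElement = `your_div_element`;
const $ = cheerio.load(divElement);

// Select the parsed <div> and print every attribute name with its value.
$('div').each((index, element) => {
    const attributes = element.attribs;
    Object.keys(attributes).forEach(key => {
        console.log(key, ': ', attributes[key]);
    });
});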

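I wrote this simple script to print every attribute name together with its value; from there you can filter out the ones you are interested in.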
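The filtering step could look something like the sketch below. It is only an illustration: `pickAttributes`, `wanted`, and `sampleAttribs` are hypothetical names, and `sampleAttribs` is a hand-written subset of the attributes from the question (without `data-note`), standing in for the `element.attribs` object that Cheerio exposes.

```javascript
// Hypothetical helper: return a new object holding only the named attributes.
function pickAttributes(attribs, names) {
    const picked = {};
    names.forEach(name => {
        // Copy the attribute only if the element actually has it.
        if (Object.prototype.hasOwnProperty.call(attribs, name)) {
            picked[name] = attribs[name];
        }
    });
    return picked;
}

// The attribute names listed in the question.
const wanted = [
    'data-season',
    'data-note',
    'data-league-name',
    'data-country-name',
    'data-home-team',
    'data-away-team'
];

// Stand-in for element.attribs, using a subset of values copied from the question.
const sampleAttribs = {
    'id': '1449822',
    'data-ko': '09:50',
    'data-season': '2017/2018',
    'data-league-name': 'A-LEAGUE',
    'data-country-name': 'AUSTRALIA',
    'data-home-team': 'MELBOURNE VICTORY',
    'data-away-team': 'BRISBANE ROAR FC'
};

const matchInfo = pickAttributes(sampleAttribs, wanted);
console.log(matchInfo);
```

Inside a Cheerio `.each` callback this would presumably be called as `pickAttributes(element.attribs, wanted)`. If only one or two values are needed, reading them directly with `$(element).attr('data-season')` is likely simpler.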