Why can't I seem to construct a jsdom object from an HTML document?

Time: 2017-05-16 14:06:42

Tags: javascript node.js http jsdom

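Goal:
Extract a value from a tag on an external web page.

Approach:
Make an HTTP request and construct a jsdom object from the response. Use jsdom's query selector to get the value out of the tag.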

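Problem:
When I try to access the value of any tag with... console.log(dom.window.document.querySelector("h4").textContent); ...I get the error: "Cannot read property 'textContent' of null".

This must mean that the jsdom object was not constructed properly because of a problem with the chunk argument (chunk is a string taken from the response object).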
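
For context, `querySelector` returns `null` when no element matches, so this error only shows that the parsed document contained no `h4`; the JSDOM object itself may well have been constructed. A minimal sketch of a null-safe lookup follows. The helper name `textOrNull` is hypothetical (it is not part of jsdom), and it assumes only that `root` exposes a standard `querySelector` method, as `dom.window.document` does.

```javascript
// Hypothetical helper, not part of jsdom: returns the text of the first match,
// or null when nothing matches, instead of throwing on `.textContent`.
function textOrNull(root, selector) {
  const el = root.querySelector(selector); // null when there is no match
  return el === null ? null : el.textContent;
}

// Intended use with the question's object (assumes `dom` is a JSDOM instance):
//   const heading = textOrNull(dom.window.document, "h4");
//   if (heading === null) console.log("no <h4> in the HTML that was parsed");
```

Checking for `null` this way separates "the document was parsed but has no such tag" from a genuine construction failure.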

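Discussion:
My guess is that there is a problem with quote escaping in the response chunk, but my attempts with regular expressions have not produced any results. If I pass a simple string such as <html><body><h4>testing</h4></body></html>, then dom.window.document.querySelector("h4").textContent works fine.

The important part of the code: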

res.on('data', (chunk) => {
    console.log(typeof(chunk)); // string
    const dom = new JSDOM(chunk);
    console.log(dom.window.document.querySelector("h4").textContent);

  });

All the code:

var querystring = require('querystring');
var http = require('http');
const jsdom = require("jsdom");
const { JSDOM } = jsdom;

const postData = querystring.stringify({
  'id': '1'
});

const options = {
  hostname: 'www.southernnbtruckers.ca',
  port: 80,
  path: '/search/info/6',
  method: 'POST',
  headers: {
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': Buffer.byteLength(postData)
  }
};

const req = http.request(options, (res) => {
  res.setEncoding('utf8');

  //Problem is likely to do with the HTTP response (chunk)
  res.on('data', (chunk) => {
    console.log(typeof(chunk)); // string
    const dom = new JSDOM(chunk);
    console.log(dom.window.document.querySelector("h4").textContent); //Cannot read property 'textContent' of null

  });
  res.on('end', () => {
      //Do stuff
  });
});

req.on('error', (e) => {
  console.error(`problem with request: ${e.message}`);
});

req.write(postData);
req.end();  
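
One possibility worth checking, offered as a hedged guess rather than a confirmed diagnosis: Node's `'data'` event can fire several times per response, each time with only a piece of the body, so `new JSDOM(chunk)` may be parsing an HTML fragment that ends before the `<h4>`. A minimal sketch of buffering the chunks and handing over the full string on `'end'` follows. `collectBody` is a hypothetical helper name, and the sketch assumes `res.setEncoding('utf8')` has been called so that every chunk is a string.

```javascript
// Hypothetical helper: gathers every 'data' chunk and calls `onDone` with the
// complete body once the stream emits 'end'.
// Assumes the chunks are strings (i.e. res.setEncoding('utf8') was called).
function collectBody(res, onDone) {
  const chunks = [];
  res.on('data', (chunk) => {
    chunks.push(chunk); // may be only a fragment of the HTML
  });
  res.on('end', () => {
    onDone(chunks.join('')); // the whole document, in arrival order
  });
}

// Intended use inside the question's http.request callback (assumes `JSDOM`
// and `options` are defined as in the code above):
//   const req = http.request(options, (res) => {
//     res.setEncoding('utf8');
//     collectBody(res, (html) => {
//       const dom = new JSDOM(html);
//       const h4 = dom.window.document.querySelector("h4");
//       console.log(h4 ? h4.textContent : "no <h4> found");
//     });
//   });
```

Parsing once in the `'end'` handler means jsdom sees the same complete markup as the simple test string, which would be consistent with the test string working while the live response does not.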

HTML for reference:

<html>
<head>
    <base href="http://www.southernnbtruckers.ca/">
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
    <meta name="generator" content="People and Groups"/>
    <meta name="keywords" content=""/>
    <title>TRUCKERS - Search</title>
    <link rel="stylesheet" type="text/css" href="/core/styles/style_custom.php?org_name=truckers&language=english"/>
</head>
<body><a name="top"></a>
<div id="Container">
    <div id="Header">
        <table id="Lang">
            <tr>
                <td valign="middle">&nbsp;&nbsp;&nbsp;<a href="/login">Login</a></td>
            </tr>
        </table>
    </div>
    <div id="MainNav">
        <div id="Nav">
            <ul>
                <li class="LeftSelectedNav"><a href="/search">Home</a></li>
                <li><a href="/contact_information">Contact</a></li>
                <li style="float:right;" class="right">&nbsp;</li>
            </ul>
        </div>
    </div>
    <div id="MainContent">
        <div id="SideNav">
            <div id="Sub">
                <ul>
                    <li id="SubTitle">Home</li>
                    <li class="subsub"><a href="/our_mission_statemnt">Our Mission Statement</a
                    ></li>
                    <li class="subsub"><a href="/our_executive">Our Executive</a></li>
                    <li class="currentSub"><a href="/search">Search</a></li>
                    <li id="SubSpacer"></li>
                </ul>
                <ul>
                    <li class="SubBlank"><h4>PO Box 342, Harvey, York Co. NB E6K 3W9</h4>
                        <center>
                            <table class="featureImageTable">
                                <tr>
                                    <td><img src="/uploads/Website_Assets/truckers-sidetest.jpg" alt="Side Test"
                                             title=""/></td>
                                </tr>
                                <tr>
                                    <td></td>
                                </tr>
                            </table>
                        </center>
                        <br/><br/><a href="http://www.partsfortrucks.com" target="_tab">
                            <center>
                                <table class="featureImageTable">
                                    <tr>
                                        <td><img src="/uploads/Website_Assets/PartTrucks.jpg" alt="Parts Trucks 300px"
                                                 title=""/></td>
                                    </tr>
                                    <tr>
                                        <td></td>
                                    </tr>
                                </table>
                            </center>
                        </a></p></li>
                    <li id="SubSpacer"></li>
                </ul>
            </div>
        </div>
        <div id="Main">
            <table class="data" id="mainContentTable" cellspacing="0" cellpadding="0" width="100%">
                <tr>
                    <td valign="top"><h1>Truckers Search</h1>Find what you need! This database is easy to use - if
                        you're looking for a specific piece of equipment for hire just use the pull down menu that says
                        "Company Name" and locate the equipment you require, then press return or the filter button. If
                        you're looking for a company to work in a specific county in New Brunswick - just use the pull
                        down menu to identify the county. You can also click on the name of any trucker to bring up
                        their equipment profile and contact information.
                        <hr/>
                        <a href="/search">&lt;&lt; Back to search</a>
                        <hr>
                        <h1>Gary MacBean</h1>
                        <div class="contact">Contact: Gary MacBean</div>
                        <hr>
                        Address1:&nbsp;150 Sunrise Estates Avenue<br>City:&nbsp;New Maryland<br>Province:&nbsp;NB<br>Postal
                        Code:&nbsp;E3C 1G6<br>Phone:&nbsp;1 506 459-3609<br>Cell Phone:&nbsp;1 506 444-1358<br>Fax:&nbsp;1
                        506 459-5154<br>
                        <hr>
                        Number Of Trucks:&nbsp;2<br>Has Dump Trailer:&nbsp;Yes<br>Has Tandem Dump Truck:&nbsp;Yes<br>Has
                        Belly Dump:&nbsp;Yes<br>Has Asphalt Tarp Spreader:&nbsp;Yes<br>
                        <hr>
                        Has Compensation WorkSafeNB:&nbsp;Yes<br>Has Liability Insurance:&nbsp;Yes<br>Has HST Number:&nbsp;Yes<br>
                        <hr>
                        Works Province Wide:&nbsp;Yes<br>
                        <hr>
                        <hr/>
                        <a href="/contact_information">Comment, Questions?</a></td>
                </tr>
            </table>
            <div id="Footer">
                <hr/>
                <p>
                <h2>Serving Central and Southern New Brunswick</h2><br/><br/>Powered by: <a
                    href="http://www.peopleandgroups.com" title="www.peopleandgroups.com">People&Groups</a></p></div>
        </div>
    </div>
</div>
</body>
</html>
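
A side observation, hedged: the `<meta>` tag above declares `charset=iso-8859-1`, while the request code decodes the response with `res.setEncoding('utf8')`. For a page that is mostly ASCII this would not explain a missing `<h4>`, but any non-ASCII bytes would be decoded incorrectly. The sketch below shows the difference, using Node's built-in `'latin1'` decoding as an approximation of that charset. `decodeBody` is a hypothetical helper and assumes the chunks are collected as Buffers (no `setEncoding` call).

```javascript
// Hypothetical helper: joins raw Buffer chunks and decodes them in one step.
// Assumes res.setEncoding() was NOT called, so each chunk is a Buffer.
function decodeBody(chunks, encoding) {
  return Buffer.concat(chunks).toString(encoding);
}

// 0xE9 is "é" in ISO-8859-1 but is not a valid UTF-8 sequence on its own.
const sample = [Buffer.from([0x63, 0x61, 0x66]), Buffer.from([0xe9])];
console.log(decodeBody(sample, 'latin1')); // café
console.log(decodeBody(sample, 'utf8'));   // "caf" followed by U+FFFD
```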

Thoughts? Should I be using a different approach to capture data from other websites?

0 answers:

No answers