XPath on a requests response returns an empty list

Time: 2016-07-19 13:58:01

Tags: python xpath web-scraping

I am trying to learn web scraping. I need to get all the URLs from this page - http://www.99acres.com/rent-property-in-chennai-ffid

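First, I need to sort the listings so the newest come first, and then replicate the getresults_ajax POST request in my code. The XPath returns valid results in the Chrome console, but in my code I get an empty list.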

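I know that replicating the request can be tedious, and I normally use Selenium with PhantomJS to scrape dynamic pages. Here, though, I need to sort the content first and then pull the data out of the response, which seems tricky.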

My code:

import requests
from lxml import html

# Form data copied from the getresults_ajax POST request
d = {
    'src': 'SORTING_date_d',
    'static_search': 'true',    
    '': 'undefined',
    'sortby': 'date_d',
    'lstAcnId': '8930791340597402',
    'encrypted_input': 'UiB8IFFTIHwgUiB8IzIjICB8IGNoZW5uYWkgIzMjfCAgfCBDUDMyIzIyIyB8IDI1MTU3NTg2IHwgIHwgMzIgfCM1IyAgfCBSICM0MCN8ICA=',
    'lstAcn': 'SEARCH',
    'is_ajax': '1'
}

h = {
    # The HTTP header name is spelled "Referer"
    'Referer': 'http://www.99acres.com/rent-property-in-chennai-ffid?orig_property_type=R&search_type=QS&search_location=CP32&pageid=QS&keyword_orig=chennai'
}

req = requests.post(url = 'http://www.99acres.com/do/quicksearch/getresults_ajax', data = d, headers = h)
r = html.fromstring(req.text)

#print('test 1' + str(req.text))

prices = r.xpath('//div[@title = "View property details"]')

print('test %d' % len(prices))
# driver = webdriver.PhantomJS(executable_path = R'C:\Python27\selenium\webdriver\phantomjs-2.1.1-windows\bin\phantomjs.exe')

for price in prices:
    print('price is this ' + str(price))

1 answer:

Answer 0 (score: 1)

If you print the text, you will see that it is a JSON response, not an HTML page:

{"html_ysf":"    <div class=\"srp-ysfWrap boxSize\">\n\n\n\n        <diV. etc.............
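One way to confirm this before handing anything to lxml is to list the keys of the JSON object. This is only a sketch: it assumes the body is a JSON object whose values are HTML strings, as the output above suggests. The helper name `html_fragments` and the `sample_body` string are made up for illustration; `sample_body` stands in for `req.text`.

```python
import json


def html_fragments(body_text):
    """Return {key: value} for every string value in a JSON object body.

    Returns an empty dict when the body is not a JSON object, for example
    when the server sent plain HTML instead.
    """
    try:
        payload = json.loads(body_text)
    except ValueError:
        return {}
    if not isinstance(payload, dict):
        return {}
    return {k: v for k, v in payload.items() if isinstance(v, str)}


# Hypothetical stand-in for req.text; the real response is much larger.
sample_body = json.dumps({
    "html_ysf": '<div class="srp-ysfWrap boxSize"></div>',
    "html2": '<div title="View property details"></div>',
})

fragments = html_fragments(sample_body)
for key, value in fragments.items():
    # Print each key and the size of its HTML fragment
    print(key, len(value))
```

With the real response, passing `req.text` to the helper would show which key holds the listing markup.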

So to get what you want, just use the html2 key to extract the HTML you are interested in:

req = requests.post(url='http://www.99acres.com/do/quicksearch/getresults_ajax', data=d, headers=h)
# The body is JSON; the listing markup is the string stored under "html2"
r = html.fromstring(req.json()["html2"])
prices = r.xpath('//div[@title = "View property details"]')
print('test %d' % len(prices))
for price in prices:
    print('price is this ' + str(price))

Each item in prices is a div element, so if we run:

    for price in prices:
        print(html.tostring(price))

we get output like the following:

b'<div data-propid="Q26021619" data-pgid="QS" class="srpWrap " title="View property details" data-fsl="N">\n\t\t<input id="ajxPDFlg" type="hidden" value="najx">\n        <input id="dataSRPCLKTRK" type="hidden" value="ON">\n        <i class="uiIcon pLatinum"></i>\t\t<div class="wrapttl">\n\t\t\t<div class="_srpttl srpttl  fwn wdthFix480 lf">\n                <b class="WebRupee f14 mr5"> &#8377;</b>                <b id="rs_Q26021619">18,000</b>\n                <a data-proppos="\'\'" id="desc_Q26021619" class="b wWrap" target="_blank" title="2 BHK,  Residential Apartment for rent in Choolaimedu" href="/2-bhk-bedroom-apartment-flat-for-rent-in-choolaimedu-chennai-central-1000-sq-ft-spid-Q26021619" data-fsl="N">2 BHK,  Residential Apartment for rent in Choolaimedu</a>            </div>\n            <i class="uline" data-maplatlngzm="13.06709,80.2195432,11" data-iwdesc="  Residential Apartment for rent in Choolaimedu" data-ttlurl="http://www.99acres.com/2-bhk-bedroom-apartment-flat-for-rent-in-choolaimedu-chennai-central-1000-sq-ft-spid-Q26021619" data-price="18,000," data-area="Super built-up ,1000,Sq.Ft." 
data-bedrm="2" data-bldname="On Request" title="View Map"><i class="uiIcon imap"></i><i class="ml_5 f13 vmid hverU">Map</i></i>            <div class="clr"></div>\n\t\t</div>\n        \n                \n\t\t<div class="srpDetail">\n\t\t\t<div class="srpImg rel">\n                <img class="imgBoxSrp lazy" alt="2 BHK,  Residential Apartment for rent in Choolaimedu" width="208" height="150" data-original="http://static.99acres.com/images/srpimages/noproperty-new.png" src="http://static.99acres.com/images/i0.gif"><div class="imgCap" data-clk-json=\'{"sno":-1,"ids":"0;732;","phType":"PROP","index":0,"text":"Sri Sakthi Real Estate","classLabel":"Dealer","profileId":"1122559","bedroomNum":"2","src":"SRP"}\'><a class="trackVamRos" vamacttype="Locality_Video_Count" vamactsrc="RENT_SRP" data-trkctgry="CLICK_LOCALITY_VIDEO_LINK" data-blid="732" href="#" data-clk-json=\'{"vtag":"LOC","sno":-1,"tab":4,"ids":"0;732;","phType":"PROP","entity":"locimages","subtab":"LVIDEO","text":"Sri Sakthi Real Estate","classLabel":"Dealer","profileId":"1122559","bedroomNum":"2","src":"SRP"}\'>1 Locality Video</a><div class="clr"></div></div>\t\t\t</div>\n\t\t\t<div class="srpDataWrap"><span>Super built-up  Area : <b>1000 Sq.Ft. 
</b></span><div class="clr pdt8"></div><span class="doElip">Society : <bclass>On Request</bclass></span><div class="sep clr mt3imp"></div><span><span>Highlights:&#160; </span> <span>On Rent&#160;</span><span> <span>/&#160;</span> 1 to 5 years old&#160;</span><span> <span>/&#160;</span> Unfurnished&#160;</span><span> <span>/&#160;</span> 2nd  Floor (out of 3)&#160;</span></span><div class="sep clr"></div>\t\t\t\t<div class="lf  f12 wBr">\n\t\t\t\t\t<b>Description :</b> \n                    Near gandhi road\nGood locality, Calm atmosphere\nCall for more details\t\t\t\t</div>\n                                                                    <div class="rel clr">\n                        <div class="lf mt13 mr13">Features: </div>\n                        <div class="iconDiv fc_icons fcInit" attr="4,5,24,">\n                        <i class="i4" value="Reserved Parking">&#160;</i><i class="i5" value="Feng Shui / Vaastu Compliant">&#160;</i><i class="i24" value="Water Storage">&#160;</i>                        </div>\n                         \n                        <div class="LyrIcon clkEvntStp top0imp"></div>\n                </div>\n                    \t\t\t</div>\n            <div class="clr p5"></div>\n            <div class="lf f13 hm10 mb5">Dealer : <a data-pid="1122559" class="hverU blkImp srpTplTrck" title="Sri Sakthi Real Estate , Chennai Central" target="_blank" href="/sri-sakthi-real-estate-chennai-central-drid-1122559">Sri Sakthi Real Estate</a>                            &#160;&#160;&#160;&#160;Posted : Today                       \n                    </div> \n                \t\t</div>\n        <div class="clr"></div>\n            <div data-srptrk="ntrck" class="srpAction m10 mt5">\n        \t\t<a data-mxid="" data-apid="1122559" data-mc="N" data-rc="R" data-cl="Dealer" data-pgid="QS" href="javascript:void(0);" class="srpBlue f13 mr10 lf cntClk" title="Send E-mail &amp; SMS"> Contact Dealer <i>FREE</i></a><a data-pgid="QS" data-src="listing rank" 
data-lst="P" data-sms="RGVhciBBRERfQlVZRVJOQU1FX0hFUkUsIHlvdSBtYXkgY29udGFjdCBCYWJ1IGF0ICs5MS05Nzg5MDc0NzQxIGZvciBJTlIgMTggSyAxMDAwIFNxLiBGdC4gRmxhdCBpbiBDaG9vbGFpbWVkdS4=" data-trksrc="listing rank" data-ttc="" href="javascript:void(0);" class="srpWhite f13 mr10 lf vpn" id="viewphnoQ26021619" title="View Phone Number">View Phone Number</a><div data-src="listing rank" id="prop_Q26021619" class="sl_container blkImp f15 lf mt5 mr10"><span class="sl_star_empty_container" title="Shortlist this property"><i class="lf uiIcon sl_star_empty"></i><span class="lf m5">Shortlist</span></span></div>\t    <div class="lf mt5 rptLtng" data-cl="A" data-md="R" data-pid="1122559" data-proptype="1" data-photocount="0" data-rescom="R">\n\t\t<div class="row dwnSrp"> \n\t\t<i class="spdpIcn repot_acu"></i> \n \t\t<a class="f13 b delCh blLink">Report problem with listing</a>\n\t    </div>\n\t    </div>\n                        </div>\n                <div class="abs verifyLbl ViconPosSrp">\n            <div id="tooltipSociety" class="infoTip2 fwn f13 ital r5 hide VlyrPosSrp">\n                Learn about our verification process <a id="verify_process_info" class="blLink uLine" href="javascript:void(0)" style="text-decoration:underline">here</a>.\n                  <i class="ver-arrow-down abs" style="left: 80px; bottom: -12px;"></i>\n            </div>\n            <i class="uiIcon verified mt8"></i>\n        </div>\n        \t\t<div class="clr pdt10"></div>\n    </div>        \n\n'
b'<div data-propid="X22163381" data-pgid="QS" class="srpWrap " title="View property details" data-fsl="N">\n\t\t<input id="ajxPDFlg" type="hidden" value="najx">\n        <input id="dataSRPCLKTRK" type="hidden" value="ON">\n        <i class="uiIcon pLatinum"></i>\t\t<div class="wrapttl">\n\t\t\t<div class="_srpttl srpttl  fwn wdthFix480 lf">\n                <b class="WebRupee f14 mr5"> &#8377;</b>                <b id="rs_X22163381">22,000</b>\n                <a data-proppos="\'\'" id="desc_X22163381" class="b wWrap" target="_blank" title="2 BHK,  Residential Apartment for rent in Choolaimedu" href="/2-bhk-bedroom-apartment-flat-for-rent-in-choolaimedu-chennai-central-1000-sq-ft-r2-spid-X22163381" data-fsl="N">2 BHK,  Residential Apartment for rent in Choolaimedu</a>            </div>\n            <i class="uline" data-maplatlngzm="13.0673818,80.2213615,11" data-iwdesc="  Residential Apartment for rent in Choolaimedu" data-ttlurl="http://www.99acres.com/2-bhk-bedroom-apartment-flat-for-rent-in-choolaimedu-chennai-central-1000-sq-ft-r2-spid-X22163381" data-price="22,000, @ &lt;span class=WebRupee&gt;&#8377; &lt;/span&gt;22/ Sq.Ft." data-area="Built-up ,1000,Sq.Ft." 
data-bedrm="2" data-bldname="On Request" title="View Map"><i class="uiIcon imap"></i><i class="ml_5 f13 vmid hverU">Map</i></i>            <div class="clr"></div>\n\t\t</div>\n        \n                \n\t\t<div class="srpDetail">\n\t\t\t<div class="srpImg rel">\n                <img class="imgBoxSrp lazy" alt="2 BHK,  Residential Apartment for rent in Choolaimedu" width="208" height="150" data-original="http://static.99acres.com/images/srpimages/noproperty-new.png" src="http://static.99acres.com/images/i0.gif"><div class="imgCap" data-clk-json=\'{"sno":-1,"ids":"0;732;","phType":"PROP","index":0,"text":"Sri Sakthi Real Estate","classLabel":"Dealer","profileId":"1122559","bedroomNum":"2","src":"SRP"}\'><a class="trackVamRos" vamacttype="Locality_Video_Count" vamactsrc="RENT_SRP" data-trkctgry="CLICK_LOCALITY_VIDEO_LINK" data-blid="732" href="#" data-clk-json=\'{"vtag":"LOC","sno":-1,"tab":4,"ids":"0;732;","phType":"PROP","entity":"locimages","subtab":"LVIDEO","text":"Sri Sakthi Real Estate","classLabel":"Dealer","profileId":"1122559","bedroomNum":"2","src":"SRP"}\'>1 Locality Video</a><div class="clr"></div></div>\t\t\t</div>\n\t\t\t<div class="srpDataWrap"><span>Built-up  Area : <b>1000 Sq.Ft. 
</b></span><div class="clr pdt8"></div><span class="doElip">Society : <bclass>On Request</bclass></span><div class="sep clr mt3imp"></div><span><span>Highlights:&#160; </span> <span>On Rent&#160;</span><span> <span>/&#160;</span> 1 to 5 years old&#160;</span><span> <span>/&#160;</span> Furnished&#160;</span><span> <span>/&#160;</span> 1st  Floor (out of 4)&#160;</span></span><div class="sep clr"></div>\t\t\t\t<div class="lf  f12 wBr">\n\t\t\t\t\t<b>Description :</b> \n                    2bhk house on rent in choolaimedu , Gill nagar area with all nessesary facilties.\t\t\t\t</div>\n                                                    \t\t\t</div>\n            <div class="clr p5"></div>\n            <div class="lf f13 hm10 mb5">Dealer : <a data-pid="1122559" class="hverU blkImp srpTplTrck" title="Sri Sakthi Real Estate , Chennai Central" target="_blank" href="/sri-sakthi-real-estate-chennai-central-drid-1122559">Sri Sakthi Real Estate</a>                            &#160;&#160;&#160;&#160;Posted : Today                       \n                    </div> \n                \t\t</div>\n        <div class="clr"></div>\n            <div data-srptrk="ntrck" class="srpAction m10 mt5">\n        \t\t<a data-mxid="" data-apid="1122559" data-mc="N" data-rc="R" data-cl="Dealer" data-pgid="QS" href="javascript:void(0);" class="srpBlue f13 mr10 lf cntClk" title="Send E-mail &amp; SMS"> Contact Dealer <i>FREE</i></a><a data-pgid="QS" data-src="listing rank" data-lst="P" data-sms="RGVhciBBRERfQlVZRVJOQU1FX0hFUkUsIHlvdSBtYXkgY29udGFjdCBCYWJ1IGF0ICs5MS05Nzg5MDc0NzQxIGZvciBJTlIgMjIgSyAxMDAwIFNxLiBGdC4gRmxhdCBpbiBDaG9vbGFpbWVkdS4=" data-trksrc="listing rank" data-ttc="" href="javascript:void(0);" class="srpWhite f13 mr10 lf vpn" id="viewphnoX22163381" title="View Phone Number">View Phone Number</a><div data-src="listing rank" id="prop_X22163381" class="sl_container blkImp f15 lf mt5 mr10"><span class="sl_star_empty_container" title="Shortlist this property"><i class="lf uiIcon 
sl_star_empty"></i><span class="lf m5">Shortlist</span></span></div>\t    <div class="lf mt5 rptLtng" data-cl="A" data-md="R" data-pid="1122559" data-proptype="1" data-photocount="0" data-rescom="R">\n\t\t<div class="row dwnSrp"> \n\t\t<i class="spdpIcn repot_acu"></i> \n \t\t<a class="f13 b delCh blLink">Report problem with listing</a>\n\t    </div>\n\t    </div>\n                        </div>\n                <div class="abs verifyLbl ViconPosSrp">\n            <div id="tooltipSociety" class="infoTip2 fwn f13 ital r5 hide VlyrPosSrp">\n                Learn about our verification process <a id="verify_process_info" class="blLink uLine" href="javascript:void(0)" style="text-decoration:underline">here</a>.\n                  <i class="ver-arrow-down abs" style="left: 80px; bottom: -12px;"></i>\n            </div>\n            <i class="uiIcon verified mt8"></i>\n        </div>\n        \t\t<div class="clr pdt10"></div>\n    </div>        \n\n'

So whatever you want has to be extracted from those elements.
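As a rough sketch of that last step, the snippet below pulls the price, title and absolute URL out of each listing div. It assumes the markup looks like the output above, with the price in a `b` element whose id starts with `rs_` and the link in an `a` element whose id starts with `desc_`. The `sample` string is a trimmed copy of one listing, and the names `parse_listings` and `BASE_URL` are illustrative, not part of the site or of lxml.

```python
from urllib.parse import urljoin

from lxml import html

# Assumed base for resolving the relative hrefs seen in the output above.
BASE_URL = "http://www.99acres.com/"

# Trimmed copy of one listing; the real fragment would come from
# req.json()["html2"].
sample = '''
<div data-propid="Q26021619" class="srpWrap " title="View property details">
  <div class="wrapttl">
    <b id="rs_Q26021619">18,000</b>
    <a id="desc_Q26021619" class="b wWrap"
       href="/2-bhk-bedroom-apartment-flat-for-rent-in-choolaimedu-chennai-central-1000-sq-ft-spid-Q26021619">2 BHK,  Residential Apartment for rent in Choolaimedu</a>
  </div>
</div>
'''


def parse_listings(fragment):
    """Yield one dict per listing div found in an HTML fragment."""
    root = html.fromstring(fragment)
    for div in root.xpath('//div[@title="View property details"]'):
        # Relative XPaths (leading ".") search only inside this listing.
        price = div.xpath('string(.//b[starts-with(@id, "rs_")])').strip()
        links = div.xpath('.//a[starts-with(@id, "desc_")]')
        if not links:
            # Skip listings without a details link
            continue
        link = links[0]
        yield {
            "id": div.get("data-propid"),
            "price": price,
            "title": link.text_content().strip(),
            "url": urljoin(BASE_URL, link.get("href")),
        }


listings = list(parse_listings(sample))
for item in listings:
    print(item["price"], item["url"])
```

The relative XPaths keep each lookup scoped to its own listing, so prices and links from different listings cannot be mixed up. If the site changes its id prefixes, the two `starts-with` expressions are the parts to adjust.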