Unable to get an attribute of an <a> tag in lxml

Time: 2017-04-20 17:10:17

Tags: html xpath web-scraping lxml lxml.html

I am using lxml to scrape data from a website. The HTML snippet is:

<span class="pro-contact-text">
<a class="click-to-call-link text-gray-light trackMe" href="javascript:;" 
   objId="104809" compid="clickToCall_profile_organic" phone="(617) 505-4149"">Click to Call</a>
</span>

I can reach the span with an XPath such as //*(some tags)/span[@class="pro-contact-text"], and when I print the variable it shows a valid element (e.g. <Element span at 0x3589510>). When I extend the XPath to span[@class="pro-contact-text"]/a/@phone, it returns an empty list. Can someone help me with this?

1 answer:

Answer 0 (score: 0)

The problem is the invalid HTML.

The phone attribute ends with "" (two quotation marks).

phone="(617) 505-4149"">
                      ^
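
If the stray quote really is what trips up the parser, one workaround is to repair the markup before handing it to lxml. The following is only a sketch under that assumption: drop_stray_quotes and extract_phones are hypothetical helper names, and the regular expression is a heuristic for this one kind of damage, not a general HTML fixer.

```python
import re

# Heuristic: a complete double-quoted attribute value followed directly by
# one or more extra quotes, just before whitespace, "/" or ">".
STRAY_QUOTE = re.compile(r'(="[^"]*")"+(?=[\s/>])')


def drop_stray_quotes(markup: str) -> str:
    """Remove extra double quotes that directly follow a closing attribute quote."""
    return STRAY_QUOTE.sub(r"\1", markup)


def extract_phones(markup: str) -> list:
    """Return the phone attribute of every link inside the contact span."""
    # Imported lazily so the repair helper can be used without lxml installed.
    import lxml.html

    tree = lxml.html.fromstring(drop_stray_quotes(markup))
    return tree.xpath('//span[@class="pro-contact-text"]/a/@phone')
```

Calling extract_phones on the snippet from the question should then yield the phone number as a one-element list. If the list is still empty after the repair, the cause is probably something else, for example the element being inserted by JavaScript after the page loads.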