BeautifulSoup returns [] even though the string exists

Asked: 2016-12-17 17:16:41

Tags: python beautifulsoup

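I am scraping a web page, and everything works fine except for one lookup, where re.compile() yields an empty [] even though the text passed to it exists on the page. Here is my scraping code: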

dob = soup.find(text = re.compile('Date of Birth')).findNext('td').text
print(dob)

father_name = soup.find(text = re.compile("Father's Name")).findNext('td').text
print(father_name)

mob_no_parent = soup.find(text = re.compile("Mobile Number")).findNext('td').text
print(mob_no_parent)

mob_no_student = soup.findAll(text = re.compile("Mobile Number(Student)"))
print(mob_no_student)

email = soup.find(text = re.compile("E - Mail Address")).findNext('td').text
print(email)

p_address = soup.find(text = re.compile("PermanentAddress")).findNext('td').text
print(p_address)

The code above works for every field except this one:

mob_no_student = soup.findAll(text = re.compile("Mobile Number(Student)"))
print(mob_no_student)

This one returns [].

Here is my HTML:

<td align="left" width="50%" class="inner_padding_even">&nbsp;&nbsp;Registration No </td>
<td align="left" width="50%" class="inner_padding_even">CPT0000</td>
</tr>
<tr> 
<td align="left" width="50%" class="inner_padding_odd">&nbsp;&nbsp;Name of Candidate</td>
<td align="left" width="50%" class="inner_padding_odd"><font face=arial size=2>KKKKKKK B.</font></td>
</tr>
<tr>
<td align="left" class="inner_padding_even">&nbsp;&nbsp;Date of Birth</td>
<td align="left" class="inner_padding_even">16.11.1900</td>
</tr>
<tr>
<td align="left" class="inner_padding_even">&nbsp;&nbsp;Father's Name</td>
<td align="left" class="inner_padding_even">BBBBBBBB.</td>
</tr>
<tr>
<td align="left" class="inner_padding_even">&nbsp;&nbsp;Mobile    Number</font>(Parent)</td>
<td align="left" class="inner_padding_even">99999999999</td>
</tr>
<tr>
<td align="left" class="inner_padding_odd">&nbsp;&nbsp;Mobile Number(Student)</td>
<td align="left" class="inner_padding_odd">9999999999</td>

</tr>
<tr>
<td align="left" class="inner_padding_even">&nbsp;&nbsp;E - Mail Address</td>
<td align="left" class="inner_padding_even">keyansgm@gmail.com</td> 
</tr>
<tr>
<td width="50%" align="left" class="inner_padding_even">&nbsp;Permanent Address</td>
<td width="50%" align="left" class="inner_padding_even">Blah blah</td>
</tr>

What am I missing here?

1 Answer:

Answer 0: (score: 2)

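In a regular expression you need to escape the parentheses; otherwise (Student) is treated as a capturing group, and the pattern matches the text Mobile NumberStudent rather than Mobile Number(Student).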

Try this:

# Raw string, so the backslashes reach the regex engine unchanged
mob_no_student = soup.findAll(text = re.compile(r"Mobile Number\(Student\)"))
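
To see the difference in isolation, here is a minimal sketch that uses only the standard re module. The label string is a stand-in for the cell text in the question's HTML (assumption: the parser decodes each &nbsp; to \xa0), and the variable names are illustrative only. It also shows re.escape, a standard-library way to build the escaped pattern from the plain label instead of adding the backslashes by hand.

```python
import re

# Stand-in for the <td> text from the question; &nbsp; is assumed to parse to \xa0.
label = "\xa0\xa0Mobile Number(Student)"

# Unescaped: (Student) is a capturing group, so this pattern expects
# the text "Mobile NumberStudent" with no parentheses in it.
unescaped = re.compile("Mobile Number(Student)")

# Escaped by hand in a raw string: \( and \) match literal parentheses.
escaped = re.compile(r"Mobile Number\(Student\)")

# re.escape adds the backslashes for every special character in the label.
auto_escaped = re.compile(re.escape("Mobile Number(Student)"))

print(unescaped.search(label))                               # None
print(escaped.search(label) is not None)                     # True
print(auto_escaped.search(label) is not None)                # True
print(unescaped.search("Mobile NumberStudent") is not None)  # True
```

Either escaped pattern can be passed to soup.find(text = ...) in the same way as the other lookups in the question, followed by .findNext('td').text to read the value cell.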