BeautifulSoup 4 is inserting closing tags

Time: 2014-01-16 21:57:42

Tags: python html python-2.7 beautifulsoup

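I am pulling HTML from Yahoo, and BS4 (4.3.2) inserts closing tags right after "EPS Estimate", including the closing </body> and </html> tags. As a result, the rest of the table ends up outside the document and I cannot parse the relevant information from it.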

Original excerpt:

<b>Earnings
Announcements for
Wednesday, January 15</b></td></tr><tr
bgcolor=dcdcdc><td><font
face=arial
size=-1><b>Company</b></font></td><td><font
face=arial
size=-1><b>Symbol</b></font></td><td
align=center><font
face=arial
size=-1><b>EPS<br>Estimate*</font></b></td><td
align=center><font
face=arial
size=-1><b>Time</b></font></td><td
align=center><font
face=arial
size=-1><b>Add
to
My<br>Calendar</b></font></td><td
align=center><font ...

After BeautifulSoup(html):

 <td align="center">
          <font face="arial" size="-1">
           <b>
            EPS
            <br>
             Estimate*
            </br>
           </b>
          </font>
         </td>
        </tr>
       </table>
      </td>
     </tr>
    </table>
   </p>
  </p>
  </p>
 </br>
</br>
</link>
</body>
</html>
<td align="center"><font face="arial" size="-1">
 <b>
Time
</b>
</font>
</td>

...
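One detail visible in the original excerpt is that the "EPS Estimate" cell closes `</font>` before `</b>`, so the two tags overlap instead of nesting. Whether this is what triggers the early closing tags in BS4 is an assumption, not something confirmed here. The sketch below uses only the standard library's `html.parser` to flag such overlaps in a fragment; `NestingChecker`, `find_misnested`, and `VOID_TAGS` are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Record closing tags that do not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open tag names
        self.problems = []   # (closing tag, tag that was expected) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        expected = self.open_tags[-1] if self.open_tags else None
        if tag != expected:
            self.problems.append((tag, expected))
        # Drop the nearest matching open tag, if any, so checking can continue.
        if tag in self.open_tags:
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]


def find_misnested(html):
    """Return a list of (closing_tag, expected_tag) pairs for overlapping tags."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


# The "EPS Estimate" cell from the excerpt, joined onto one line.
cell = "<td align=center><font face=arial size=-1><b>EPS<br>Estimate*</font></b></td>"
print(find_misnested(cell))  # [('font', 'b')]
```

If the overlap does turn out to matter, one thing to try is naming the parser explicitly, since `BeautifulSoup` accepts it as a second argument (for example `BeautifulSoup(html, "html.parser")`, `"lxml"`, or `"html5lib"`) and these parsers repair invalid markup differently. Which of them, if any, keeps this table intact has not been verified here.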

0 answers:

No answers