I want to use Python and bs4 to remove the stray spaces and <br> tags from a scraped HTML table.
HTML table:
<tr>
<td style="width: 15; BORDER-BOTTOM: 1px solid">col1</td>
<td colspan="2" style="width: 120; BORDER-BOTTOM: 1px solid"> col2</td>
<td style="width: 50; BORDER-BOTTOM: 1px solid">col3</td>
<td style="width: 50; BORDER-BOTTOM: 1px solid">col5</td>
<td style="width: 50; BORDER-BOTTOM: 1px solid">col6</td>
<td style="width: 90; BORDER-BOTTOM: 1px solid" align="center">col7</td>
<td style="width: 90; BORDER-BOTTOM: 1px solid" align="center">col8</td>
<td style="width: 10; BORDER-BOTTOM: 1px solid">col9</td>
<td style="width: 10; BORDER-BOTTOM: 1px solid">col
<br> 1
<br>0</td>
<td style="width: 10; BORDER-BOTTOM: 1px solid">col11</td>
<td style="width: 10; BORDER-BOTTOM: 1px solid" >col12</td>
<td style="width: 10; BORDER-BOTTOM: 1px solid">col13</td>
<td style="width: 10; BORDER-BOTTOM: 1px solid">col14</td>
<td style="width:10;BORDER-BOTTOM: 1px solid;" >col15</td>
</tr>
<tr bordercolor="#000000" class="rows1">
<td align="left"> 1</td>
<td colspan="2" style="BORDER-LEFT: 1px solid" align="left"> 123456789</td>
<td style="BORDER-LEFT: 1px solid" align="left"> John </td>
<td style="BORDER-LEFT: 1px solid" align="left"> Doe </td>
<td style="BORDER-LEFT: 1px solid" align="left"> </td>
<td style="BORDER-LEFT: 1px solid" align="right"> 3.000</td>
<td style="BORDER-LEFT: 1px solid" align="right"> 0,00</td>
<td style="BORDER-LEFT: 1px solid" align="right"> 30</td>
<td style="BORDER-LEFT: 1px solid" align="right"> 0</td>
<td style="BORDER-LEFT: 1px solid" align="right"> </td>
<td style="BORDER-LEFT: 1px solid" align="right"> </td>
<td style="BORDER-LEFT: 1px solid; BORDER-RIGHT: 1px solid" align="right"> </td>
<td style="BORDER-LEFT: 1px solid; BORDER-RIGHT: 1px solid" align="right"> </td>
<td style="BORDER-LEFT: 1px solid;BORDER-RIGHT: 1px solid;" align="right"> 5000</td>
</tr>
<tr bordercolor="#000000" class="rows0">
<td align="left"> 2</td>
<td colspan="2" style="BORDER-LEFT: 1px solid" align="left"> 123456789</td>
<td style="BORDER-LEFT: 1px solid" align="left"> Jane </td>
<td style="BORDER-LEFT: 1px solid" align="left"> Doe </td>
<td style="BORDER-LEFT: 1px solid" align="left"> </td>
<td style="BORDER-LEFT: 1px solid" align="right"> 3.000</td>
<td style="BORDER-LEFT: 1px solid" align="right"> 0,00</td>
<td style="BORDER-LEFT: 1px solid" align="right"> 30</td>
<td style="BORDER-LEFT: 1px solid" align="right"> 0</td>
<td style="BORDER-LEFT: 1px solid" align="right"> 3</td>
<td style="BORDER-LEFT: 1px solid" align="right"> </td>
<td style="BORDER-LEFT: 1px solid; BORDER-RIGHT: 1px solid" align="right"> </td>
<td style="BORDER-LEFT: 1px solid; BORDER-RIGHT: 1px solid" align="right"> </td>
<td style="BORDER-LEFT: 1px solid;BORDER-RIGHT: 1px solid;" align="right"> 5000</td>
</tr>
Answer 0 (score: 0)
You can use the string replace method.
text = '<br>test test'
# Strip the <br> tag, then remove every space character.
text = text.replace('<br>', '').replace(' ', '')
print(text)  # testtest
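Chaining replace calls like this also deletes the spaces between words, and it misses variants such as <br/> or the &nbsp; entity. Below is a minimal sketch of a gentler cleanup using only the standard library. The helper name clean_cell and the pattern BR_TAG are hypothetical, chosen for illustration, and the sketch assumes each cell is available as a raw markup string.

```python
import html
import re

# Hypothetical pattern: matches <br>, <br/> and <br /> in any letter case.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def clean_cell(raw: str) -> str:
    """Return cell text without <br> tags or non-breaking spaces."""
    text = BR_TAG.sub("", raw)        # drop line-break tags
    text = html.unescape(text)        # decode entities, e.g. &nbsp; becomes U+00A0
    text = text.replace("\xa0", " ")  # turn non-breaking spaces into plain spaces
    return " ".join(text.split())     # collapse whitespace runs and trim the ends


print(clean_cell("<br>test test"))     # test test
print(clean_cell("&nbsp;John&nbsp;"))  # John
```

With bs4, the same helper could presumably be applied to the inner markup of each cell, for example clean_cell(td.decode_contents()) for every td found in the table. Note that it keeps a single space where a line break used to be, so a header split across lines is not glued back into one word.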