Using Beautiful Soup 4

Time: 2018-02-26 07:38:17

Tags: python html css web-scraping beautifulsoup

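I am new to Python and web scraping and need some help. I am trying to scrape a section of a web page that is made up of nested blocks, and I want to export all of these blocks to a CSV file. I wrote the following code: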

from bs4 import BeautifulSoup
import urllib2  # Python 2 standard library

patent = "US20170293299"
urla = "https://patents.google.com/patent/" + patent

# Download the patent page and parse it
page = urllib2.urlopen(urla)
soup = BeautifulSoup(page, 'html.parser')

# Collect every <div class="claim-text">, including nested ones
list1 = soup.find_all('div', class_='claim-text')

# Print each match followed by a blank separator
for x in list1:
    print x
    print '\n'

This is the output I get after running the code in PyCharm:
<div class="claim-text"> <b>1</b>. An automated driving system of a vehicle comprising: <div class="claim-text">a surrounding environment information acquiring device acquiring surrounding environment information relating to surrounding environment conditions of the vehicle;</div> <div class="claim-text">a vehicle information acquiring device acquiring vehicle information relating to conditions of the vehicle;</div> <div class="claim-text">a driver information acquiring device acquiring driver information relating to conditions of a driver of the vehicle;</div> <div class="claim-text">an automated driving executing part executing automated driving of the vehicle based on a driving assistance package packaging permissions for a plurality of driving assistance operations;</div> <div class="claim-text">a package determining part determining a driving assistance package to be proposed to the driver based on at least one of the surrounding environment information, the vehicle information, and the driver information,</div> <div class="claim-text">a package proposing part proposing the driving assistance package determined by the package determining part to the driver; and</div> <div class="claim-text">an emergency condition judging part judging if the driver is in an emergency condition based on the driver information, wherein</div> <div class="claim-text">the automated driving executing part performs automated driving of the vehicle based on an emergency driving assistance package packaging permissions of the plurality of driving assistance operations when the driver is in an emergency condition, if the emergency condition judging part judges that the driver is in an emergency condition, and performs automated driving of the vehicle based on the driving assistance package proposed by the packaging proposing part and approved by the driver, if the emergency condition judging part judges that the driver is not in an emergency condition.</div> </div>


<div class="claim-text">a surrounding environment information acquiring device acquiring surrounding environment information relating to surrounding environment conditions of the vehicle;</div>


<div class="claim-text">a vehicle information acquiring device acquiring vehicle information relating to conditions of the vehicle;</div>


<div class="claim-text">a driver information acquiring device acquiring driver information relating to conditions of a driver of the vehicle;</div>


<div class="claim-text">an automated driving executing part executing automated driving of the vehicle based on a driving assistance package packaging permissions for a plurality of driving assistance operations;</div>


<div class="claim-text">a package determining part determining a driving assistance package to be proposed to the driver based on at least one of the surrounding environment information, the vehicle information, and the driver information,</div>


<div class="claim-text">a package proposing part proposing the driving assistance package determined by the package determining part to the driver; and</div>


<div class="claim-text">an emergency condition judging part judging if the driver is in an emergency condition based on the driver information, wherein</div>


<div class="claim-text">the automated driving executing part performs automated driving of the vehicle based on an emergency driving assistance package packaging permissions of the plurality of driving assistance operations when the driver is in an emergency condition, if the emergency condition judging part judges that the driver is in an emergency condition, and performs automated driving of the vehicle based on the driving assistance package proposed by the packaging proposing part and approved by the driver, if the emergency condition judging part judges that the driver is not in an emergency condition.</div>


<div class="claim-text"> <b>2</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein <div class="claim-text">the package determining part determines a driving assistance package to be proposed to the driver based on the driver information, and</div> <div class="claim-text">the emergency condition judging part judges that the driver is in an emergency condition when the driving assistance package determined by the package determining part is the emergency driving assistance package.</div> </div>


<div class="claim-text">the package determining part determines a driving assistance package to be proposed to the driver based on the driver information, and</div>


<div class="claim-text">the emergency condition judging part judges that the driver is in an emergency condition when the driving assistance package determined by the package determining part is the emergency driving assistance package.</div>


<div class="claim-text"> <b>3</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein <div class="claim-text">the system further comprises an alarm part issuing a warning to the driver, and</div> <div class="claim-text">the automated driving executing part performs automated driving of the vehicle based on the emergency driving assistance package if the emergency condition judging part judges that the driver is in an emergency condition after the warning by the alarm part.</div> </div>


<div class="claim-text">the system further comprises an alarm part issuing a warning to the driver, and</div>


<div class="claim-text">the automated driving executing part performs automated driving of the vehicle based on the emergency driving assistance package if the emergency condition judging part judges that the driver is in an emergency condition after the warning by the alarm part.</div>


<div class="claim-text"> <b>4</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00002">claim 2</claim-ref>, wherein <div class="claim-text">the system further comprises an alarm part issuing a warning to the driver, and</div> <div class="claim-text">the automated driving executing part performs automated driving of the vehicle based on the emergency driving assistance package if the emergency condition judging part judges that the driver is in an emergency condition after the warning by the alarm part.</div> </div>


<div class="claim-text">the system further comprises an alarm part issuing a warning to the driver, and</div>


<div class="claim-text">the automated driving executing part performs automated driving of the vehicle based on the emergency driving assistance package if the emergency condition judging part judges that the driver is in an emergency condition after the warning by the alarm part.</div>


<div class="claim-text"> <b>5</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00003">claim 3</claim-ref>, wherein the alarm part issues the warning to the driver by sound.</div>


<div class="claim-text"> <b>6</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00004">claim 4</claim-ref>, wherein the alarm part issues the warning to the driver by sound.</div>


<div class="claim-text"> <b>7</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00003">claim 3</claim-ref>, wherein the alarm part issues the warning to the driver by changing the behavior of the vehicle.</div>


<div class="claim-text"> <b>8</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00004">claim 4</claim-ref>, wherein the alarm part issues the warning to the driver by changing the behavior of the vehicle.</div>


<div class="claim-text"> <b>9</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00005">claim 5</claim-ref>, wherein the alarm part issues the warning to the driver by changing the behavior of the vehicle.</div>


<div class="claim-text"> <b>10</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00006">claim 6</claim-ref>, wherein the alarm part issues the warning to the driver by changing the behavior of the vehicle.</div>


<div class="claim-text"> <b>11</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>12</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00002">claim 2</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>13</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00003">claim 3</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>14</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00004">claim 4</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>15</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00005">claim 5</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>16</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00006">claim 6</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>17</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00007">claim 7</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>18</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00008">claim 8</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>19</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00009">claim 9</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>


<div class="claim-text"> <b>20</b>. The vehicle automated driving system according to <claim-ref idref="CLM-00010">claim 10</claim-ref>, wherein in the emergency driving assistance package, an auto stop control of the vehicle and a hazard light turn on control of the vehicle are permitted.</div>



Process finished with exit code 0

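The problem is that whenever a group of text starts, the whole block is read as a single element, and the blocks nested inside that large block are then read again separately. I do not want the whole block to be read as one big element.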
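The duplication described here can be reproduced on a small scale. The following is a minimal sketch in Python 3, not the code from the question: the `html` string is a shortened, made-up stand-in for the real page. It shows that `find_all` searches all descendants, so the outer `claim-text` div matches along with each div nested inside it.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened stand-in for one claim on the patent page.
html = (
    '<div class="claim-text"><b>1</b>. A system comprising: '
    '<div class="claim-text">a first device;</div> '
    '<div class="claim-text">a second device.</div>'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")

# find_all descends through the whole tree, so the outer div and
# both inner divs are returned as separate matches.
matches = soup.find_all("div", class_="claim-text")

for div in matches:
    # A match is nested if one of its ancestors is also a claim-text div.
    nested = div.find_parent("div", class_="claim-text") is not None
    print(nested, div.get_text(" ", strip=True))
```

The outer match still contains the text of its nested divs, which is why the first printed element in the output above repeats everything that follows it.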

<div class="claim-text">
      <b>
       1
      </b>
      . An automated driving system of a vehicle comprising:
      <div class="claim-text">
       a surrounding environment information acquiring device acquiring surrounding environment information relating to surrounding environment conditions of the vehicle;
      </div>

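Look at the HTML above. I want the first row to contain
"1. An automated driving system of a vehicle comprising:"
and the second row to contain
"a surrounding environment information acquiring device acquiring surrounding environment information relating to surrounding environment conditions of the vehicle;"
... and so on.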

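But the problem is that the data after the first line does not contain a tag of its own, so the whole first paragraph appears in the first list element.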

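Here is the gist of what I want to do:
I want to extract all the text that starts at a <div class="claim-text"> tag and ends at either a </div> tag or a new <div class="claim-text"> tag. Any help would be appreciated. Thanks in advance.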
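One possible way to get this is to keep `find_all` as it is but, for each matched div, collect only the content that is not inside a nested `claim-text` div. The sketch below is written for Python 3 and is only an illustration under assumptions: the helper names `is_claim_div`, `own_text`, `claim_rows` and `write_csv` are made up here, and `sample` is a shortened stand-in for the real page. Inline children such as `<b>` and `<claim-ref>` are kept as part of the row.

```python
import csv
import io

from bs4 import BeautifulSoup
from bs4.element import Tag


def is_claim_div(node):
    """True for a <div> tag whose class list includes 'claim-text'."""
    return (
        isinstance(node, Tag)
        and node.name == "div"
        and "claim-text" in (node.get("class") or [])
    )


def own_text(div):
    """Text of a claim div, leaving out any nested claim-text divs."""
    parts = []
    for child in div.children:
        if is_claim_div(child):
            continue  # nested block: it becomes a row of its own
        if isinstance(child, Tag):
            parts.append(child.get_text())  # inline tag such as <b>
        else:
            parts.append(str(child))  # plain text node
    # Collapse runs of whitespace and trim both ends.
    return " ".join("".join(parts).split())


def claim_rows(html):
    """One string per claim-text div, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    return [own_text(d) for d in soup.find_all("div", class_="claim-text")]


def write_csv(rows, fh):
    """Write each text as a single-column CSV row to an open file object."""
    writer = csv.writer(fh)
    for text in rows:
        writer.writerow([text])


# Hypothetical, shortened sample in the same shape as the page's claims.
sample = (
    '<div class="claim-text"> <b>1</b>. A system comprising: '
    '<div class="claim-text">a first device;</div> '
    '<div class="claim-text">a second device.</div> </div>'
    '<div class="claim-text"> <b>2</b>. The system according to '
    '<claim-ref idref="CLM-00001">claim 1</claim-ref>, wherein it works.</div>'
)

rows = claim_rows(sample)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, the same `write_csv` call could be given a file opened with `open("claims.csv", "w", newline="")`, where the file name is only an example.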

0 answers:

No answers