Loop through pages and scrape content in Python

Time: 2021-03-11 07:43:47

Tags: python-3.x web-scraping beautifulsoup python-requests web-crawler

I want to scrape content from this link:

[Two screenshots of the page]

How can I loop through all the pages and scrape every element circled in red? Thanks.

Code:

from bs4 import BeautifulSoup
import requests

# Fetch the first listing page and parse its HTML
url = 'http://www.eoechina.com.cn/cn2019/gonggaoxinxi.html?classID=1'
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
print(soup)

1 answer:

Answer 0: (score: 1)

You can query an endpoint to iterate over the pages.

Here's how:

<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8" />
  <title>Simple example</title>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
</head>

<body>
  <div id="div1" onclick="showTitle('div1')" style="border: 1px solid black; cursor:default">div1</div>
  <div id="div2" onclick="showTitle('div2')" style="border: 1px solid black; cursor:default">div2</div>
  <div id="div3" onclick="showTitle('div3')" style="border: 1px solid black; cursor:default">div3</div>
  <div>...</div>
  <div id="divN" onclick="showTitle('divN')" style="border: 1px solid black; cursor:default">divN</div>
  <script>
    function showTitle(id) {
      $('#' + id).attr('title', 'get some info from the server for "' + id + '" on clicking on it');
      //$('#'+id).trigger('focus'); // not working
      //$('#'+id).hover(); // not working
      //$('#'+id).mouseover(); // not working
      //$('#'+id).trigger('mousemove'); // not working
    }
  </script>
</body>

</html>
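
The idea of querying an endpoint page by page can be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not the site's actual API: the `page` query parameter, the CSS selector `ul.list li a`, and the helper names `iter_pages` and `fetch_page_from_site` are all hypothetical and would need to be replaced with whatever the browser's network tab shows for the real site. The loop itself is generic: keep requesting the next page until one comes back empty, with a cap as a safety net.

```python
from typing import Callable, Iterator, List


def iter_pages(
    fetch_page: Callable[[int], List[str]],
    start: int = 1,
    max_pages: int = 50,
) -> Iterator[str]:
    """Yield items page by page until a page returns nothing.

    max_pages is a safety cap so a misbehaving endpoint cannot
    cause an endless loop.
    """
    for page in range(start, start + max_pages):
        items = fetch_page(page)
        if not items:
            break
        yield from items


def fetch_page_from_site(page: int) -> List[str]:
    """Hypothetical fetcher: download one listing page and return its titles."""
    # Imported here so the generic loop above works without these packages.
    import requests
    from bs4 import BeautifulSoup

    # ASSUMPTION: the base URL comes from the question, but the "page"
    # parameter name is a placeholder; check the real request in the browser.
    url = "http://www.eoechina.com.cn/cn2019/gonggaoxinxi.html"
    response = requests.get(url, params={"classID": 1, "page": page}, timeout=10)
    response.raise_for_status()

    soup = BeautifulSoup(response.content, "html.parser")
    # ASSUMPTION: this selector is a placeholder for the circled elements.
    return [a.get_text(strip=True) for a in soup.select("ul.list li a")]
```

With a working fetcher, usage would look like `for title in iter_pages(fetch_page_from_site): print(title)`. Passing the fetcher in as a function keeps the paging loop independent of the site, so it can be tried out with a fake fetcher before touching the network.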

Sample output: (this is a screenshot, because SO thinks the output is spam...)

[Screenshot of the output]
