Question

I am trying to get the number of the last page of this website: http://digitalmoneytimes.com/category/crypto-news/

The page shows that the last page number is 335, but I can't extract it.

soup = BeautifulSoup(page.content, 'html.parser')
soup_output= soup.find_all("li",{"class":"active"})
soup_output=soup.select(tag)
print(soup_output)

I get an empty list as output.

Answer 1

To get the last page of the given website, I strongly suggest you use the following code:

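import requests
from bs4 import BeautifulSoup

page = requests.get("http://digitalmoneytimes.com/category/crypto-news/")
soup = BeautifulSoup(page.content, 'html.parser')
# Collect every link whose href points at a paginated listing page.
links = soup.find_all("a", href=True)
pages = []
for x in links:
    if "http://digitalmoneytimes.com/category/crypto-news/page/" in x["href"]:
        pages.append(x)
# This relies on the third matching link being the last-page link;
# the index depends on the site's markup.
last_page = pages[2].getText()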

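Here last_page holds the last page number. Since I don't have access to your tag and page variables, I can't really tell you where the problem in your code is.

I really hope this solves your problem.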
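
Picking a fixed index such as pages[2] depends on the order of the links on the page. As a rough sketch of a less position-dependent variant, the page number could be read out of each href and the largest one kept. The helper name last_page_from_hrefs, the regular expression, and the sample hrefs below are assumptions made for illustration; only the ".../page/N/" URL shape comes from the code in this answer.

```python
import re

# Matches the page number at the end of URLs such as ".../page/12/".
# The URL shape is assumed from the link prefix used in this answer.
PAGE_RE = re.compile(r"/page/(\d+)/?(?:[?#].*)?$")


def last_page_from_hrefs(hrefs):
    """Return the largest page number found in the hrefs, or None if there is none."""
    numbers = []
    for href in hrefs:
        match = PAGE_RE.search(href)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers) if numbers else None


# Hypothetical hrefs standing in for the links found on the listing page.
sample_hrefs = [
    "http://digitalmoneytimes.com/category/crypto-news/",
    "http://digitalmoneytimes.com/category/crypto-news/page/2/",
    "http://digitalmoneytimes.com/category/crypto-news/page/3/",
    "http://digitalmoneytimes.com/category/crypto-news/page/335/",
    "http://digitalmoneytimes.com/about/",
]
print(last_page_from_hrefs(sample_hrefs))
```

With a BeautifulSoup object for the page, the input list could be built from the href attribute of every tag returned by find_all("a", href=True). Returning None when nothing matches makes a missing pagination block easier to notice than an IndexError would.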

Answer 2

If you want to get the last page number, you can also try the following approach:

import requests
from bs4 import BeautifulSoup

link = 'http://digitalmoneytimes.com/category/crypto-news/'

res = requests.get(link)
soup = BeautifulSoup(res.text,"lxml")
# The sibling just before the "pagination-next" element holds the last page number.
last_page_num = soup.find(class_="pagination-next").find_previous_sibling().text
print(last_page_num)

Get the last page number of pagination - Beautiful Soup
