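Web-scraping code using BS4 + requests does not refresh

Time: 2021-07-19 06:23:44

Tags: python web-scraping beautifulsoup python-requests

I have a problem with code that scrapes a weather website. It should update every hour, but for some reason the data it returns is not the current data on the site; it also never updates, and keeps returning the same data. Please help!

Also, I need help scraping the weather icon from the site.

Here is my code: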

from bs4 import BeautifulSoup
from plyer import notification
import requests
import time

# URL = 'https://weather.com/weather/today/l/5.02,7.97?par=google'
# URL = 'http://localhost/weather.com/weather/today/l/5.02,7.97.html'
URL = 'https://weather.com/en-NG/weather/today/l/4dce0117809bca3e9ecdaa65fb45961a9718d6829adeb72b6a670240e10bd8c9'
REFRESH_SECONDS = 30  # interval used in the question; 3600 would give an hourly refresh


def notifyMe(title, message):
    # Show a desktop notification for 7 seconds.
    notification.notify(
        title=title,
        message=message,
        # app_icon=icon,
        timeout=7
    )


if __name__ == '__main__':
    while True:
        # Defaults, so that every except branch still has something to display.
        location = "Couldn't get a location."
        weather = "Couldn't read the weather data."
        try:
            site = requests.get(URL)
            soup = BeautifulSoup(site.content, 'html.parser')
            day = soup.find(class_='CurrentConditions--CurrentConditions--14ztG')

            location = day.find(class_='CurrentConditions--location--2_osB').get_text()
            timestamp = day.find(class_='CurrentConditions--timestamp--3_-CV').get_text()
            tempValue = day.find(class_='CurrentConditions--tempValue--1RYJJ').get_text()
            phraseValue = day.find(class_='CurrentConditions--phraseValue--17s79').get_text()
            # icon = day.find(id='svg-symbol-cloud')  # note: get_icon() is not a BeautifulSoup method
            weather = timestamp + "\n" + tempValue + " " + phraseValue

            # The precipitation line may be missing; append it only when it is found.
            precip = day.find(class_='CurrentConditions--precipValue--1RgXi')
            if precip is not None:
                weather += "\n" + precip.get_text()
        except requests.exceptions.ConnectionError:
            weather = "Error connecting to website."
        except AttributeError:
            # A find() call returned None, e.g. because a class name no longer matches.
            pass

        # print(weather)

        notifyMe(location, weather)
        time.sleep(REFRESH_SECONDS)

Expected output:

Uyo, Akwa Ibom Weather As of 13:28 WAT 30° Cloudy 55% chance of rain until 14:00
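
A side note on the selectors: the class names in the code above end in generated-looking suffixes such as `--1RYJJ`. If those suffixes change, `find()` returns `None` and the lookup fails. Below is a minimal sketch of matching on the stable part of the name with a CSS substring selector. The sample HTML and the helper name `text_by_class_fragment` are invented for illustration and are not the site's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
SAMPLE_HTML = """
<div class="CurrentConditions--CurrentConditions--zzz99">
  <span class="CurrentConditions--tempValue--abc12">30°</span>
  <div class="CurrentConditions--phraseValue--def34">Cloudy</div>
</div>
"""


def text_by_class_fragment(soup, fragment):
    """Return the text of the first element whose class attribute contains fragment, or None."""
    # [class*="..."] matches elements whose class attribute contains the substring.
    element = soup.select_one(f'[class*="{fragment}"]')
    return element.get_text(strip=True) if element is not None else None


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
temp = text_by_class_fragment(soup, 'CurrentConditions--tempValue')
phrase = text_by_class_fragment(soup, 'CurrentConditions--phraseValue')
precip = text_by_class_fragment(soup, 'CurrentConditions--precipValue')  # absent in the sample
print(temp, phrase, precip)
```

Because the helper returns `None` for a missing element instead of raising, the caller can decide per field whether the value is optional.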

2 answers:

Answer 0: (score: 1)

import requests
from bs4 import BeautifulSoup


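def main(url):
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'lxml')

    # Collect every non-empty text fragment inside the first '.card' element.
    x = list(soup.select_one('.card').stripped_strings)
    # Drop the fragments at indexes 4-7, keeping location, time, temperature, phrase and rain chance.
    del x[4:8]
    print(x)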


main('https://weather.com/en-NG/weather/today/l/4dce0117809bca3e9ecdaa65fb45961a9718d6829adeb72b6a670240e10bd8c9')

Output:

['Uyo, Akwa Ibom Weather', 'As of 8:03 WAT', '24°', 'Cloudy', '47% chance of rain until 9:00']
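
For anyone wondering how this works: `stripped_strings` yields each non-empty text fragment inside the element, in document order, and `del x[4:8]` removes the four fragments at indexes 4 to 7. Here is a small offline sketch; the markup, including the filler entries that get deleted, is invented for illustration and is not the site's real card.

```python
from bs4 import BeautifulSoup

# Invented markup; the real '.card' element on the site will differ.
SAMPLE_HTML = """
<div class="card">
  <h1>Uyo, Akwa Ibom Weather</h1>
  <div>As of 8:03 WAT</div>
  <span>24°</span>
  <div>Cloudy</div>
  <span>Day</span><span>27°</span><span>Night</span><span>22°</span>
  <div>47% chance of rain until 9:00</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
card = soup.select_one('.card')

# Every non-whitespace text node under the card, stripped, in document order.
fragments = list(card.stripped_strings)

# Drop the entries at indexes 4, 5, 6 and 7 (the invented filler values).
del fragments[4:8]
print(fragments)
```

Slicing by position is compact but depends on the page keeping the same order of text fragments, so it is worth re-checking the indexes if the output ever looks shifted.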

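Answer 1: (score: 0)

It seems the error may have come from the site itself, because the code now works without any problems. Thanks, everyone, for your suggestions. @Ahmed American, your code is beautiful; I learned from it. @furas, I will try building the SVG as you suggested.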
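
On the icon part of the question, here is a minimal sketch of pulling an inline `<svg>` element out of parsed HTML as a string. It assumes the icon is embedded directly in the page as an `<svg>` tag with a known id, which may not match the real site. The sample markup and the helper name `extract_svg` are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the id mirrors the one in the question's commented-out line.
SAMPLE_HTML = """
<div class="conditions">
  <svg id="svg-symbol-cloud" width="24" height="24">
    <path d="M0 0h24v24H0z"></path>
  </svg>
</div>
"""


def extract_svg(html, element_id):
    """Return the serialized <svg> element with the given id, or None if it is absent."""
    soup = BeautifulSoup(html, 'html.parser')
    svg = soup.find('svg', id=element_id)
    return str(svg) if svg is not None else None


icon_markup = extract_svg(SAMPLE_HTML, 'svg-symbol-cloud')
missing = extract_svg(SAMPLE_HTML, 'svg-symbol-sun')  # not in the sample, so None
print(icon_markup)
```

The returned string could be written to a `.svg` file. Whether the notification backend can display it is a separate matter, so a conversion to another image format may still be needed before passing a path as `app_icon`.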
