Good day, I am having trouble fetching data from the following website:
http://weather.news24.com/sa/johannesburg
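I have tried Python requests and urllib, without success. Inspecting the page's resources with Chrome Developer Tools, I found a URL that contains the data I need, but I still cannot get the data as JSON. I want the low and high temperatures, sunrise, and sunset.
It looks to me like there is an AJAX function that loads the data. I tried both libraries so that I can use them in Django later. I am using Python 3. Any help would be greatly appreciated.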
Answer 0 (score: 1)
Hope this helps:
import json
import re
import requests
from bs4 import BeautifulSoup
# This is your main url
main_url="http://weather.news24.com/sa/johannesburg"
# I am extracting city name from url. Not sure if you already have that somewhere
mycity=main_url.split('/')[-1]
# Calling your main_url
r=requests.get(main_url)
# The only valuable info in this response is the CityId for Johannesburg,
# so let's grab it using BeautifulSoup (naming the parser explicitly avoids a warning)
soup=BeautifulSoup(r.content,"html.parser")
# This gives me the list of all the cities on the website and their CityId
city_list=soup.find(id="ctl00_WeatherContentHolder_ddlCity")
# I am looking for the city (johannesburg) within city_list
# re.I makes the match case-insensitive
city_as_on_website=city_list.find(string=re.compile(mycity,re.I)).parent
cityId=city_as_on_website['value']
# Now make a POST request to the following URL, with the following headers and data, to get the JSON
json_url="http://weather.news24.com/ajaxpro/TwentyFour.Weather.Web.Ajax,App_Code.ashx"
headers={'Content-Type':'text/plain; charset=UTF-8',
'Host':'weather.news24.com',
'Origin':'http://weather.news24.com',
'Referer':main_url,
'User-Agent':'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/48.0.2564.82 Chrome/48.0.2564.82 Safari/537.36',
'X-AjaxPro-Method':'GetCurrentOne'}
payload={"cityId": cityId} # This is the cityId that we found above using BeautifulSoup
# Now send the POST request
r=requests.post(json_url,headers=headers,data=json.dumps(payload))
# r.content now holds the JSON data you expect.
# Unfortunately, it is not well formatted,
# and fixing that would be a separate question on Stack Overflow.
# Hopefully you can work your way through it.
# Good luck! :-)
Out[1]: '{"__type":"TwentyFour.Services.Weather.Objects.CurrentOneReport, TwentyFour.Services.Weather, Version=1.2.0.0, Culture=neutral, PublicKeyToken=null","Observations":[{"__type":"TwentyFour.Services.Weather.Objects.Observation, TwentyFour.Services.Weather, Version=1.2.0.0, Culture=neutral, PublicKeyToken=null","CityName":"Lanseria Civ / Mil","Location":"Lanseria Civ / Mil","Sky":"Passing clouds","Temperature":"25.00","Humidity":"54","WindSpeed":"15","WindDirectionAbreviated":"SE","Comfort":"26","DewPoint":"15","Description":"Passing clouds. Warm.","Icon":"2","IconName"
...
...
":null,"Rainfall":"14mm","Snowfall":"*","PrecipitationProbability":"52","Icon":"22","IconName":"tstorms","Cached":false},"AstronomyReport":null,"MarineReport":null,"LocalTime":"Wed, 24 Feb 2016 17:30:27 SAST","LocalUpdateTime":"Wed, 24 Feb 2016 17:12:07 SAST","CountryName":"South Africa","TimeZone":"2","Cached":false};/*'