Extracting JSON data using BeautifulSoup and requests

Time: 2016-02-27 23:55:14

Tags: python django django-templates beautifulsoup django-views

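I'm facing a strange problem at the moment. I have tried many approaches but still can't solve it. I have the following class-based view; its get_context_data() fetches the weather for a given city. From the returned JSON I need to extract the specific information I want to use, but I'm struggling to get that right. When I inspect the rendered template context with django-debug-toolbar, I can see all the data from the JSON, but in the actual template in the browser I see something strange. Here is my code: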

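import json
import re

import requests
from bs4 import BeautifulSoup
from django.views import generic


class FetchWeather(generic.TemplateView):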
    template_name = 'forecastApp/pages/weather.html'

    request_post = None

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        url = 'http://weather.news24.com/sa/cape-town'
        city = 'cape town'
        url_request = requests.get(url)
        soup = BeautifulSoup(url_request.content, 'html.parser')
        city_list = soup.find(id="ctl00_WeatherContentHolder_ddlCity")
        print(soup.head)
        city_as_on_website = city_list.find(text=re.compile(city, re.I)).parent
        cityId = city_as_on_website['value']
        json_url = "http://weather.news24.com/ajaxpro/TwentyFour.Weather.Web.Ajax,App_Code.ashx"

        headers = {
            'Content-Type': 'text/plain; charset=UTF-8',
            'Host': 'weather.news24.com',
            'Origin': 'http://weather.news24.com',
            'Referer': url,
            'User-Agent': 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/48.0.2564.82 Chrome/48.0.2564.82 Safari/537.36',
            'X-AjaxPro-Method': 'GetCurrentOne'}

        payload = {
            "cityId": cityId
        }
        request_post = requests.post(json_url, headers=headers, data=json.dumps(payload))
        print(request_post.content)
        # convert_date: callback (defined elsewhere) that rewrites each JS Date literal as valid JSON
        data = re.sub(r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)", convert_date, request_post.text)
        data = data.strip(";/*")
        data = json.loads(data)
        context["cityId"] = data
        return context

These are screenshots of the actual template and of the debug-toolbar analysis: [normal template view] [debug toolbar]

In fact, all I need from this JSON is what is under Forecasts:

{
  'CountryName': 'South Africa',
  '__type': 'TwentyFour.Services.Weather.Objects.CurrentOneReport, TwentyFour.Services.Weather, Version=1.2.0.0, Culture=neutral, PublicKeyToken=null',
  'MarineReport': None,
  'TimeZone': '2',
  'Location': {
    '__type': 'TwentyFour.Services.Weather.Objects.Location, TwentyFour.Services.Weather, Version=1.2.0.0, Culture=neutral, PublicKeyToken=null',
    'Forecasts': [
      {
        'DayLight': 'D',
        'WindDirection': '161',
        '__type': 'TwentyFour.Services.Weather.Objects.Forecast, TwentyFour.Services.Weather, Version=1.2.0.0, Culture=neutral, PublicKeyToken=null',
        'SkyDescriptor': '1',
        'Date': '2016-01-27T22:00:00',
        'Rainfall': '*',
        'Icon': '1',
        'WindDirectionDescription': 'South',
        'Visibility': None,
        'TemperatureDescription': 'Mild',
        'HighTemp': '24',
        'TemperatureDescriptor': '8',
        'BeaufortDescriptor': 'Fresh breeze',
        'Cached': False,
        'PrecipitationDescriptor': '',
        'Snowfall': '*',
        'DaySegment': None,
        'ShortWeekDay': 'Sun',
        'DaySequence': 1,
        'WindSpeed': '34',
        'WeekDay': 'Sunday',
        'Sky': 'Sunny',
        'PrecipitationProbability': '0',
        'Precipitation': '',
        'WindDirectionAbreviated': 'S',
        'FormattedDate': 'Sun, Feb 28',
        'Segment': None,
        'Beaufort': '5',
        'Description': 'Sunny. Mild.',
        'IconName': 'sunny',
        'Temperature': None,
        'DewPoint': '14',
        'Air': 'Breezy',
        'Humidity': '55',
        'UV': 'High',
        'Comfort': '25',
        'LowTemp': '18',
        'DayOfWeek': 1,
        'AirDescription': '13'
      }
    ],
    'City': '77107',
    'Cached': False,
    'CityName': 'Cape Town'
  }
}

That is, I want to use BeautifulSoup to extract the low temperature, the high temperature and the date.

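1 Answer:

Answer 0: (score: 1)

It seems to me that Beautiful Soup is used to retrieve the city_id of the city in question.

Once the JSON has been retrieved, it is converted into a Python object: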

    ...
    data = json.loads(data)
    ...
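
The question's code relies on a `convert_date` callback that is not shown. Below is a minimal sketch of that clean-up step, assuming the endpoint returns JSON-like text that contains JavaScript `new Date(Date.UTC(...))` literals and ends with `;/*`, as the regular expression and the `strip` call in the question suggest. The body of `convert_date`, the `parse_ajaxpro` helper and the `raw` sample string are all hypothetical. JavaScript's `Date.UTC` counts months from zero, so the sketch adds 1 to the month.

```python
import json
import re
from datetime import datetime

# Same pattern as in the question: seven integer arguments to Date.UTC.
DATE_RE = re.compile(
    r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)"
)


def convert_date(match):
    """Hypothetical callback: turn a JS Date.UTC literal into a quoted ISO string."""
    year, month, day, hour, minute, second, ms = map(int, match.groups())
    # Date.UTC months are zero-based; Python's datetime months start at 1.
    moment = datetime(year, month + 1, day, hour, minute, second, ms * 1000)
    return '"{}"'.format(moment.isoformat())


def parse_ajaxpro(text):
    """Replace Date literals, drop the trailing ';/*', and parse the rest as JSON."""
    cleaned = DATE_RE.sub(convert_date, text).strip(";/*")
    return json.loads(cleaned)


# Made-up sample shaped like the response described in the question.
raw = '{"Date":new Date(Date.UTC(2016,1,27,22,0,0,0)),"HighTemp":"24"};/*'
data = parse_ajaxpro(raw)
print(data["Date"])
```

If the dates in the parsed result look one month early compared with `FormattedDate`, the month offset is a likely place to check.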

Assuming this works correctly, the required items can be picked out of this object and added to the context:

EDITED

    ....
    # Forecasts is a list nested under Location in the JSON shown in the question
    forecast = data['Location']['Forecasts'][0]
    context["LowTemp"] = forecast["LowTemp"]
    context["HighTemp"] = forecast["HighTemp"]
    context["Date"] = forecast["Date"]
    return context
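
Going by the structure printed in the question, the forecasts sit in a list under `data['Location']['Forecasts']`. Here is a minimal sketch of pulling the three values out defensively, so that a missing key gives an empty result instead of a `KeyError`. The helper name `extract_forecast` and the trimmed `sample` dict are assumptions for illustration only.

```python
def extract_forecast(data, index=0):
    """Return LowTemp, HighTemp and Date for one forecast, or {} if it is missing."""
    forecasts = (data.get("Location") or {}).get("Forecasts") or []
    if not 0 <= index < len(forecasts):
        return {}
    forecast = forecasts[index]
    return {key: forecast.get(key) for key in ("LowTemp", "HighTemp", "Date")}


# Trimmed copy of the structure shown in the question.
sample = {
    "CountryName": "South Africa",
    "Location": {
        "CityName": "Cape Town",
        "Forecasts": [
            {"LowTemp": "18", "HighTemp": "24", "Date": "2016-01-27T22:00:00"}
        ],
    },
}

values = extract_forecast(sample)
print(values)
```

Inside `get_context_data()` this could be used as `context.update(extract_forecast(data))`, after which the template can refer to `{{ LowTemp }}`, `{{ HighTemp }}` and `{{ Date }}` directly.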