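Good day. I am having a problem extracting values from JSON. First of all, my BeautifulSoup code works perfectly well in the shell, but not in Django. What I am trying to achieve is to extract data from the JSON I receive, but without success. The class in my views is shown below.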
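The JSON contains an array, "Observations", from which I am trying to get the city name and the high and low temperatures.
But when I try to do it like this: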
import json
import re

import requests
from bs4 import BeautifulSoup
from django.views import generic


class FetchWeather(generic.TemplateView):
    template_name = 'forecastApp/pages/weather.html'

    def get_context_data(self, **kwargs):
        context = super().get_context_data(**kwargs)
        url = 'http://weather.news24.com/sa/cape-town'
        city = 'cape town'

        # Scrape the page to find the id the site uses for the city.
        url_request = requests.get(url)
        soup = BeautifulSoup(url_request.content, 'html.parser')
        city_list = soup.find(id="ctl00_WeatherContentHolder_ddlCity")
        print(soup.head)
        city_as_on_website = city_list.find(text=re.compile(city, re.I)).parent
        cityId = city_as_on_website['value']

        # Ask the AjaxPro endpoint for the current observations.
        json_url = "http://weather.news24.com/ajaxpro/TwentyFour.Weather.Web.Ajax,App_Code.ashx"
        headers = {
            'Content-Type': 'text/plain; charset=UTF-8',
            'Host': 'weather.news24.com',
            'Origin': 'http://weather.news24.com',
            'Referer': url,
            'User-Agent': 'Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/48.0.2564.82 Chrome/48.0.2564.82 Safari/537.36',
            'X-AjaxPro-Method': 'GetCurrentOne',
        }
        payload = {
            "cityId": cityId
        }
        request_post = requests.post(json_url, headers=headers, data=json.dumps(payload))
        print(request_post.content)

        # The raw response body (bytes) goes into the context unparsed.
        context['Observations'] = request_post.content
        return context
I get an error. This is the call that fails:
cityDict = json.loads(str(html))
Any help would be greatly appreciated.
Answer 0 (score: 1)
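There are two problems with the JSON data inside request_post.content:
- it contains JavaScript Date object values, for example: "Date":new Date(Date.UTC(2016,1,26,22,0,0,0))
- there are unwanted characters at the end: ;/*
Let's clean up the JSON data so that it can be loaded with json: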
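import json
import re
from datetime import datetime

data = request_post.text

def convert_date(match):
    # Date.UTC takes a 0-based month and milliseconds;
    # datetime takes a 1-based month and microseconds.
    year, month, day, hour, minute, second, ms = map(int, match.groups())
    value = datetime(year, month + 1, day, hour, minute, second, ms * 1000)
    return '"' + value.strftime("%Y-%m-%dT%H:%M:%S") + '"'

# Replace each JS Date constructor with a quoted ISO 8601 string.
data = re.sub(r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)",
              convert_date,
              data)
data = data.strip(";/*")  # drop the unwanted trailing characters
data = json.loads(data)
context['Observations'] = data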
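
To check the cleanup steps without Django or a network connection, here is a minimal, self-contained sketch. The helper name `clean_ajaxpro_json` and the sample payload are made up for illustration; only the `Date` format, the "Observations" name and the trailing `;/*` come from the response described above. The month is shifted by one on the assumption that the server follows the standard JavaScript `Date.UTC` convention of 0-based months.

```python
import json
import re
from datetime import datetime

# Matches new Date(Date.UTC(y,m,d,H,M,S,ms)) as it appears in the response.
JS_DATE = re.compile(
    r"new Date\(Date\.UTC\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)\)"
)


def _js_date_to_iso(match):
    """Turn one JS Date constructor into a quoted ISO 8601 string."""
    year, month, day, hour, minute, second, ms = map(int, match.groups())
    # Assumption: 0-based month (JavaScript) -> 1-based month (Python),
    # and milliseconds -> microseconds.
    value = datetime(year, month + 1, day, hour, minute, second, ms * 1000)
    return json.dumps(value.strftime("%Y-%m-%dT%H:%M:%S"))


def clean_ajaxpro_json(raw):
    """Hypothetical helper: make the response text loadable and parse it."""
    cleaned = JS_DATE.sub(_js_date_to_iso, raw)
    # Remove surrounding whitespace, then the trailing ";/*" characters.
    cleaned = cleaned.strip().rstrip(";/*")
    return json.loads(cleaned)


# Made-up sample shaped like the problem described above.
sample = '{"Observations":[{"Date":new Date(Date.UTC(2016,1,26,22,0,0,0))}]};/*'
result = clean_ajaxpro_json(sample)
print(result["Observations"][0]["Date"])  # 2016-02-26T22:00:00
```

In the view, the parsed result could then be stored with `context['Observations'] = clean_ajaxpro_json(request_post.text)`. Note that `request_post.text` is used rather than `request_post.content`, since the regular expression and `json.loads` here work on `str`, not on the `bytes` that `.content` returns.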