Question

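As mentioned in my previous question, I am using Beautiful Soup with Python to retrieve weather data from a website.

Here's what the website looks like: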

<channel>
<title>2 Hour Forecast</title>
<source>Meteorological Services Singapore</source>
<description>2 Hour Forecast</description>
<item>
<title>Nowcast Table</title>
<category>Singapore Weather Conditions</category>
<forecastIssue date="18-07-2016" time="03:30 PM"/>
<validTime>3.30 pm to 5.30 pm</validTime>
<weatherForecast>
<area forecast="TL" lat="1.37500000" lon="103.83900000" name="Ang Mo Kio"/>
<area forecast="SH" lat="1.32100000" lon="103.92400000" name="Bedok"/>
<area forecast="TL" lat="1.35077200" lon="103.83900000" name="Bishan"/>
<area forecast="CL" lat="1.30400000" lon="103.70100000" name="Boon Lay"/>
<area forecast="CL" lat="1.35300000" lon="103.75400000" name="Bukit Batok"/>
<area forecast="CL" lat="1.27700000" lon="103.81900000" name="Bukit Merah"/>
<channel>

I managed to retrieve the forecastIssue date & validTime. However, I am unable to retrieve the different area forecasts.

Here is my Python code:

import requests
from bs4 import BeautifulSoup
import urllib3

outfile = open(r'C:\scripts\idk.xml', 'w')

#getting the time

r = requests.get('http://www.nea.gov.sg/api/WebAPI/?dataset=2hr_nowcast&keyref=<keyrefno>')
soup = BeautifulSoup(r.content, "xml")
time = soup.find('validTime').string
print(time)

#print issue date and time
for currentdate in soup.findAll('item'):
    string = currentdate.find('forecastIssue')
    print(string)

This is the part where I want to retrieve the area forecasts, e.g. <area forecast="TL" lat="1.37500000" lon="103.83900000" name="Ang Mo Kio"/>

for area in soup.findAll('weatherForecast'):
    areastring = area.find('area')
    print(areastring)

When I run my code in Python, it retrieves only the first area, which is Ang Mo Kio.

Sample output:

2.30 pm to 5.30 pm
<forecastIssue date="22-07-2016" time="02:30 PM"/>
<area forecast="RA" lat="1.37500000" lon="103.83900000" name="Ang Mo Kio"/>

Inspect element of the website

As you can see, the area forecasts are inside a div class.

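1. How do I loop through all the areas? I have tried searching on Google, but nothing I found seems to work with my code.
2. Is there a way to split the date and time?
3. Is there a way to write the data retrieved by BeautifulSoup into an XML file? My output file contains no data after I run the code.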

Thank you.

Answer 1

1. To loop through all the areas:

areas = soup.select('area')
for data in areas:
    print(data.get('name'))

Output:

Ang Mo Kio
Bedok
Bishan
Boon Lay
Bukit Batok
Bukit Merah
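
Each area element also carries forecast, lat and lon attributes, so the same loop can collect all of them. A minimal sketch, assuming a trimmed copy of the feed from the question is held in a string and that lxml is installed for the "xml" parser (SAMPLE and area_records are made-up names for illustration):

```python
from bs4 import BeautifulSoup

# Trimmed copy of the feed shown in the question (hypothetical stand-in for r.content)
SAMPLE = """<channel>
<item>
<forecastIssue date="18-07-2016" time="03:30 PM"/>
<validTime>3.30 pm to 5.30 pm</validTime>
<weatherForecast>
<area forecast="TL" lat="1.37500000" lon="103.83900000" name="Ang Mo Kio"/>
<area forecast="SH" lat="1.32100000" lon="103.92400000" name="Bedok"/>
<area forecast="TL" lat="1.35077200" lon="103.83900000" name="Bishan"/>
</weatherForecast>
</item>
</channel>"""


def area_records(xml_text):
    """Return one dict of attributes per <area> element."""
    soup = BeautifulSoup(xml_text, "xml")  # the "xml" parser needs lxml
    # find_all returns every match; dict(tag.attrs) copies the tag's attributes
    return [dict(area.attrs) for area in soup.find_all("area")]


records = area_records(SAMPLE)
for rec in records:
    print(rec["name"], rec["forecast"], rec["lat"], rec["lon"])
```

All attribute values come back as strings, so lat and lon would need float() before any arithmetic.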

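2. You can also access the data individually:

# with the "xml" parser tag names are case-sensitive, so match the feed's forecastIssue
date = soup.select('forecastIssue')[0].get('date')
time = soup.select('forecastIssue')[0].get('time')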
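
If the goal is to work with the issue time as a single timestamp, the two attribute strings can be combined with the standard library. A minimal sketch, assuming the date is always DD-MM-YYYY and the time is a 12-hour clock with AM/PM as in the sample (parse_issue is a hypothetical helper name, and %p relies on an English-style locale):

```python
from datetime import datetime


def parse_issue(date_str, time_str):
    """Combine e.g. "18-07-2016" and "03:30 PM" into one datetime."""
    return datetime.strptime(date_str + " " + time_str, "%d-%m-%Y %I:%M %p")


issued = parse_issue("18-07-2016", "03:30 PM")
print(issued.date())  # date part only
print(issued.time())  # time part only
```

Once the value is a datetime, the date and time parts can be split off or reformatted as needed.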

Answer 2

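When I run my code in Python, it retrieves only the first area, which is Ang Mo Kio

With the XML provided, findAll('weatherForecast') returns a sequence containing a single element. Your loop then iterates over that sequence and calls find('area'), which stops after finding one element and returns it (if there is one). To find all the area elements within weatherForecast:

for area in soup.find('weatherForecast').find_all('area'):
    print(area)

Is there a way to split the date and time?

Not entirely sure what you mean; perhaps you want to extract the values from the element:

issue = soup.find('forecastIssue')
print(issue['date'])
print(issue['time'])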
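
On the last point in the question, the output file stays empty because the script opens it but never writes anything to it. One way to produce a file is sketched here with the standard library's xml.etree.ElementTree rather than BeautifulSoup, assuming the area attributes have already been collected as a list of dictionaries (the write_areas helper, the records list and the areas.xml file name are all illustrative):

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_areas(records, path):
    """Write one <area> element per record under a <weatherForecast> root."""
    root = ET.Element("weatherForecast")
    for rec in records:
        ET.SubElement(root, "area", attrib=rec)  # attribute values must be strings
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


records = [
    {"forecast": "TL", "lat": "1.37500000", "lon": "103.83900000", "name": "Ang Mo Kio"},
    {"forecast": "SH", "lat": "1.32100000", "lon": "103.92400000", "name": "Bedok"},
]
out_path = os.path.join(tempfile.gettempdir(), "areas.xml")
write_areas(records, out_path)
```

Building a fresh tree keeps only the fields you choose; if you would rather save the feed as received, writing the response body straight to the file is simpler.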
