I am trying to write a Python script that dynamically reads XML data from a URL (e.g. http://www.wrh.noaa.gov/mesowest/getobextXml.php?sid=KCQT&num=72).
The XML is formatted as follows:
<station id="KCQT" name="Los Angeles / USC Campus Downtown" elev="179" lat="34.02355" lon="-118.29122" provider="NWS/FAA">
<ob time="04 Oct 7:10 pm" utime="1507169400">
<variable var="T" description="Temp" unit="F" value="61"/>
<variable var="TD" description="Dewp" unit="F" value="39"/>
<variable var="RH" description="Relh" unit="%" value="45"/>
</ob>
<ob time="04 Oct 7:05 pm" utime="1507169100">
<variable var="T" description="Temp" unit="F" value="61"/>
<variable var="TD" description="Dewp" unit="F" value="39"/>
<variable var="RH" description="Relh" unit="%" value="45"/>
</ob>
<ob time="04 Oct 7:00 pm" utime="1507168800">
<variable var="T" description="Temp" unit="F" value="61"/>
<variable var="TD" description="Dewp" unit="F" value="39"/>
<variable var="RH" description="Relh" unit="%" value="45"/>
</ob>
<ob time="04 Oct 6:55 pm" utime="1507168500">
<variable var="T" description="Temp" unit="F" value="61"/>
<variable var="TD" description="Dewp" unit="F" value="39"/>
<variable var="RH" description="Relh" unit="%" value="45"/>
</ob>
</station>
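I only want to retrieve the timestamp and the decimal temperature ("Temp") for all available dates (there are more than the 4 shown here).
The output should be a CSV-formatted text file, with one timestamp and temperature pair printed per line.
Here is my attempt at the code (it is terrible and does not work at all):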
import requests
weatherXML = requests.get("http://www.wrh.noaa.gov/mesowest/getobextXml.php?sid=KCQT&num=72")
import xml.etree.ElementTree as ET
import csv
tree = ET.parse(weatherXML)
root = tree.getroot()
# open file for writing
Time_Temp = open('timestamp_temp.csv', 'w')
#csv writer object
csvwriter = csv.writer(Time_Temp)
time_temp = []
count = 0
for member in root.findall('ob'):
    if count == 0:
        temperature = member.find('T').var
        time_temp.append(temperature)
        csvwriter.writerow(time_temp)
        count = count + 1
    temperature = member.find('T').text
    time_temp.append(temperature)
Time_Temp.close()
Please help.
Answer 0 (score: 0)
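You can first iterate over the ob elements, get the time attribute of each ob, find the variable element whose var is T and read its value attribute (the temperature), append both to a list, and write that list to the CSV file: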
import xml.etree.ElementTree as ET
import csv
# parse a local copy of the XML saved from the URL
tree = ET.parse('getobextXml.php.xml')
root = tree.getroot()
# open file for writing (Python 3; on Python 2 use open('timestamp_temp.csv', 'wb'))
with open('timestamp_temp.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(["Time", "Temp"])
    for ob in root.iter('ob'):
        time_temp = []
        timestamp = ob.get('time')  # get the time attribute of element ob
        # find the variable element whose var is T, and get its value attribute
        temp = ob.find("./variable[@var='T']").get('value')
        time_temp.append(timestamp)
        time_temp.append(temp)
        csvwriter.writerow(time_temp)
After that, timestamp_temp.csv will contain the result:
Time,Temp
04 Oct 8:47 pm,68
04 Oct 7:47 pm,68
04 Oct 6:47 pm,70
04 Oct 5:47 pm,74
04 Oct 4:47 pm,75
04 Oct 3:47 pm,75
04 Oct 2:47 pm,77
04 Oct 1:47 pm,78
04 Oct 12:47 pm,78
04 Oct 11:47 am,76
04 Oct 10:47 am,74
04 Oct 9:47 am,72
...
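For anyone wondering why the lookup needs the [@var='T'] predicate: in this XML, T is the value of the var attribute, not a tag name, so a call like member.find('T') from the question finds nothing. The snippet below is a minimal sketch that shows the difference offline, using two observations trimmed from the sample XML in the question. The SAMPLE string and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Two observations trimmed from the sample XML in the question.
SAMPLE = """
<station id="KCQT" name="Los Angeles / USC Campus Downtown">
  <ob time="04 Oct 7:10 pm" utime="1507169400">
    <variable var="T" description="Temp" unit="F" value="61"/>
    <variable var="TD" description="Dewp" unit="F" value="39"/>
  </ob>
  <ob time="04 Oct 7:05 pm" utime="1507169100">
    <variable var="T" description="Temp" unit="F" value="61"/>
    <variable var="RH" description="Relh" unit="%" value="45"/>
  </ob>
</station>
"""

root = ET.fromstring(SAMPLE.strip())
first_ob = root.find("ob")

# find('T') searches for a child *tag* named T; there is none, so it returns None.
by_tag = first_ob.find("T")

# The predicate selects the <variable> child whose var attribute equals "T".
by_attr = first_ob.find("./variable[@var='T']")

rows = [
    (ob.get("time"), ob.find("./variable[@var='T']").get("value"))
    for ob in root.iter("ob")
]
print(by_tag, by_attr.get("value"), rows)
```

Note that the values come back as strings; convert with float() if you need numbers.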
Answer 1 (score: 0)
Assuming Python 3, this will work. I have noted the difference for Python 2 where needed:
import xml.etree.ElementTree as ET
import requests
import csv
weatherXML = requests.get("http://www.wrh.noaa.gov/mesowest/getobextXml.php?sid=KCQT&num=72")
root = ET.fromstring(weatherXML.text)
# Use this with Python 2
# with open('timestamp_temp.csv','wb') as Time_Temp:
with open('timestamp_temp.csv','w',newline='') as Time_Temp:
    csvwriter = csv.writer(Time_Temp)
    csvwriter.writerow(['Time','Temp'])
    for member in root.iterfind('ob'):
        date = member.attrib['time']
        temp = member.find("variable[@var='T']").attrib['value']
        csvwriter.writerow([date,temp])
Output:
Time,Temp
04 Oct 11:47 pm,65
04 Oct 10:47 pm,66
04 Oct 9:47 pm,68
04 Oct 8:47 pm,68
04 Oct 7:47 pm,68
04 Oct 6:47 pm,70
04 Oct 5:47 pm,74
04 Oct 4:47 pm,75
.
.
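One thing to watch for: if an observation has no variable with var="T", find() returns None and the .attrib lookup raises an error. I do not know whether the real feed ever omits the temperature, so treat the following as a sketch under that assumption. The helper name write_time_temp and the sample input are hypothetical; the function takes the XML as text and writes to any file-like object, so it can be tried without a network connection.

```python
import csv
import io
import xml.etree.ElementTree as ET


def write_time_temp(xml_text, out_file):
    """Write one (time, temp) CSV row per <ob>; return the number of data rows."""
    root = ET.fromstring(xml_text)
    writer = csv.writer(out_file)
    writer.writerow(["Time", "Temp"])
    written = 0
    for ob in root.iterfind("ob"):
        temp = ob.find("variable[@var='T']")
        if temp is None:
            # Assumed case: no temperature reported for this observation, skip it.
            continue
        writer.writerow([ob.get("time"), temp.get("value")])
        written += 1
    return written


# Hypothetical input: the second observation has no T variable.
sample = (
    '<station id="KCQT">'
    '<ob time="04 Oct 7:10 pm"><variable var="T" value="61"/></ob>'
    '<ob time="04 Oct 7:05 pm"><variable var="RH" value="45"/></ob>'
    '</station>'
)
buffer = io.StringIO()
count = write_time_temp(sample, buffer)
print(count)
print(buffer.getvalue())
```

To use it with the code above, pass weatherXML.text as xml_text and a file opened with open('timestamp_temp.csv', 'w', newline='') as out_file. The check is written as "is None" on purpose, because an Element with no children is falsy and a plain "if not temp" would skip valid entries.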