我正在尝试使用python将有关API流量的信息保存到excel。
基本上我现在可以获取一次数据,但是我希望它在标题下面,然后在csv输出中的每次新迭代中连续下去;在该示例中,标题是交通行程时间,以米为单位的长度,出发时间和以秒为单位的交通延迟。我基本上每隔10分钟查看一次交通数据/无论什么时候说。
那么我将如何创建标题来分隔列,然后让数据低于这些列,以获得excel输出。我假设这与我如何将信息附加到我设置为数据的变量有关。
基本上看起来像这样,
Traffic Delay - Length In Meters - Departure Time - etc
Data at 0time - Data at 0time - Data at 0time - etc
Data at10time - Data at10time - Data at10time - etc
到目前为止我写的剧本如下。
from lxml import etree
import urllib.request
import csv
#Pickle is not needed
#append to list next
def handleLeg(leg):
# print this leg as text, or save it to file maybe...
text = etree.tostring(leg, pretty_print=True)
# also process individual elements of interest here if we want
tagsOfInterest=["noTrafficTravelTimeInSeconds", "lengthInMeters", "departureTime", "trafficDelayInSeconds"] # whatever
#list to use for data analysis
global data
data = []
#create header dictionary that includes the data to be appended within it. IE, Header = {TrafficDelay[data(0)]...etc
for child in leg:
if 'summary' in child.tag:
for elem in child:
for item in tagsOfInterest:
if item in elem.tag:
data.append(elem.text)
def parseXML(xmlFile):
While option
lastTime = time.time() - 600
while time.time() >= lastTime + 600:
lastTime += 600
#Parse the xml
#Threading way to run every couple of seconds
#threading.Timer(5.0, parseXML, ["xmlFile"]).start()
with urllib.request.urlopen("https://api.tomtom.com/routing/1/calculateRoute/-37.79205923474775,145.03010268799338:-37.798883995180496,145.03040309540322:-37.807106781970354,145.02895470253526:-37.80320743019992,145.01021142594075:-37.7999012967757,144.99318476311566:?routeType=shortest&key=xxxx&computeTravelTimeFor=all") as fobj:
xml = fobj.read()
root = etree.fromstring(xml)
for child in root:
if 'route' in child.tag:
handleLeg(child)
# Write CSV file
with open('datafile.csv', 'w') as fp:
writer = csv.writer(fp, delimiter=' ')
# writer.writerow(["your", "header", "foo"]) # write header
writer.writerows(data)
"""for elem in child:
if 'leg' in elem.tag:
handleLeg(elem)
"""
if __name__ == "__main__":
parseXML("xmlFile")
with open('datafile.csv', 'r') as fp:
reader = csv.reader(fp, quotechar='"')
# next(reader, None) # skip the headers
data_read = [row for row in reader]
print(data_read)
这是一个API出现的例子(它的XML)
<calculateRouteResponse xmlns="http://api.tomtom.com/routing" formatVersion="0.0.12">
<copyright>...</copyright>
<privacy>...</privacy>
<route>
<summary>
<lengthInMeters>5144</lengthInMeters>
<travelTimeInSeconds>687</travelTimeInSeconds>
<trafficDelayInSeconds>0</trafficDelayInSeconds>
<departureTime>2018-01-16T11:16:06+11:00</departureTime>
<arrivalTime>2018-01-16T11:27:33+11:00</arrivalTime>
<noTrafficTravelTimeInSeconds>478</noTrafficTravelTimeInSeconds>
<historicTrafficTravelTimeInSeconds>687</historicTrafficTravelTimeInSeconds>
<liveTrafficIncidentsTravelTimeInSeconds>687</liveTrafficIncidentsTravelTimeInSeconds>
</summary>
<leg>
<summary>
<lengthInMeters>806</lengthInMeters>
<travelTimeInSeconds>68</travelTimeInSeconds>
<trafficDelayInSeconds>0</trafficDelayInSeconds>
<departureTime>2018-01-16T11:16:06+11:00</departureTime>
<arrivalTime>2018-01-16T11:17:14+11:00</arrivalTime>
<noTrafficTravelTimeInSeconds>59</noTrafficTravelTimeInSeconds>
<historicTrafficTravelTimeInSeconds>68</historicTrafficTravelTimeInSeconds>
<liveTrafficIncidentsTravelTimeInSeconds>68</liveTrafficIncidentsTravelTimeInSeconds>
</summary>
<points>...</points>
</leg>
<leg>
<summary>
<lengthInMeters>958</lengthInMeters>
<travelTimeInSeconds>114</travelTimeInSeconds>
<trafficDelayInSeconds>0</trafficDelayInSeconds>
<departureTime>2018-01-16T11:17:14+11:00</departureTime>
<arrivalTime>2018-01-16T11:19:08+11:00</arrivalTime>
<noTrafficTravelTimeInSeconds>77</noTrafficTravelTimeInSeconds>
<historicTrafficTravelTimeInSeconds>114</historicTrafficTravelTimeInSeconds>
<liveTrafficIncidentsTravelTimeInSeconds>114</liveTrafficIncidentsTravelTimeInSeconds>
</summary>
<points>...</points>
</leg>
<leg>
<summary>
<lengthInMeters>1798</lengthInMeters>
<travelTimeInSeconds>224</travelTimeInSeconds>
<trafficDelayInSeconds>0</trafficDelayInSeconds>
<departureTime>2018-01-16T11:19:08+11:00</departureTime>
<arrivalTime>2018-01-16T11:22:53+11:00</arrivalTime>
<noTrafficTravelTimeInSeconds>181</noTrafficTravelTimeInSeconds>
<historicTrafficTravelTimeInSeconds>224</historicTrafficTravelTimeInSeconds>
<liveTrafficIncidentsTravelTimeInSeconds>224</liveTrafficIncidentsTravelTimeInSeconds>
</summary>
<points>...</points>
</leg>
<leg>
<summary>
<lengthInMeters>1582</lengthInMeters>
<travelTimeInSeconds>280</travelTimeInSeconds>
<trafficDelayInSeconds>0</trafficDelayInSeconds>
<departureTime>2018-01-16T11:22:53+11:00</departureTime>
<arrivalTime>2018-01-16T11:27:33+11:00</arrivalTime>
<noTrafficTravelTimeInSeconds>160</noTrafficTravelTimeInSeconds>
<historicTrafficTravelTimeInSeconds>280</historicTrafficTravelTimeInSeconds>
<liveTrafficIncidentsTravelTimeInSeconds>280</liveTrafficIncidentsTravelTimeInSeconds>
</summary>
<points>...</points>
</leg>
<sections>
<section>
<startPointIndex>0</startPointIndex>
<endPointIndex>139</endPointIndex>
<sectionType>TRAVEL_MODE</sectionType>
<travelMode>car</travelMode>
</section>
</sections>
</route>
</calculateRouteResponse>
真的很感谢你的帮助 - 我对目前如何处理感到非常困惑。
答案 0 :(得分:1)
xml和csv python库中有各种工具,以及将xml解析为csv的多种方法。
http://blog.appliedinformaticsinc.com/how-to-parse-and-convert-xml-to-csv-using-python/
似乎已经写了一个例子,如果有点冗长......
我建议您阅读库中的文档,然后使用它们以您认为合适的方式转换数据。
https://docs.python.org/2/library/xml.html
https://docs.python.org/3/library/csv.html
响应OP评论更新:
使用while循环。
lastTime = time.time() - 600
while time.time() >= lastTime + 600:
lastTime += 600
do whatever here