I need some help. I have a large CSV file (8785+ rows).
What I basically need is the maximum temperature for each month. For example (output):
Month Max Temperature
January 5.3
February 6.1
March 25.5
...
I wrote this:
temp = open("weather_2012.csv","r")
total = 0
maxt = 0.0
for line in temp:
    try:
        p = float(line.split(",")[1])
        total += 1
        maxt = max(maxt,p)
    except:
        pass
print("Maximum:",maxt)
But it only gives a single maximum for the whole file rather than one per month:
Maximum: 33.0
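Answer 0 (score: 3)
I think this is a good way to do it because it avoids hard-coding many (if not most) of the values into the code (so it can be used for any year, and it uses locale-specific month names):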
from calendar import month_name
import csv
from datetime import datetime
import sys

filename = 'weather_2012.csv'
max_temps = [-sys.maxsize] * 13  # has extra [0] entry

with open(filename, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    next(reader)  # skip header row
    for date, high_temp, *_ in reader:
        month = datetime.strptime(date, '%Y-%m-%d %H:%M:%S').month
        max_temps[month] = max(max_temps[month], float(high_temp))

print('Monthly Max Temperatures\n')
longest = max(len(month) for month in month_name)  # length of longest month name
for month, temp in enumerate(max_temps[1:], 1):
    print('{:>{width}}: {:5.1f}'.format(month_name[month], temp, width=longest))
Output:
Monthly Max Temperatures

  January:   5.3
 February:   6.1
    March:  25.5
    April:  27.8
      May:  31.2
     June:  33.0
     July:  33.0
   August:  32.8
September:  28.4
  October:  21.1
 November:  17.5
 December:  11.9
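A note on the 13-slot list: `calendar.month_name` has an empty string at index 0, so month numbers 1 to 12 can be used as indices directly. Below is a minimal sketch of the same idea on a few in-memory rows. The sample rows and the helper name `monthly_max` are made up for illustration, and `float('-inf')` is swapped in as the sentinel in place of `-sys.maxsize`.

```python
from calendar import month_name
import csv
from datetime import datetime
import io

# Hypothetical sample in the same layout as weather_2012.csv (illustration only).
SAMPLE = """Date/Time,Temp (C),Weather
2012-01-01 00:00:00,-1.8,Fog
2012-01-15 13:00:00,5.3,Clear
2012-02-03 14:00:00,6.1,"Freezing Drizzle,Fog"
2012-02-04 02:00:00,-7.0,Snow
"""

def monthly_max(csvfile):
    """Return a 13-element list; indices 1..12 hold each month's maximum."""
    max_temps = [float('-inf')] * 13  # index 0 is unused, like month_name[0]
    reader = csv.reader(csvfile)
    next(reader)  # skip header row
    for date, temp, *_ in reader:
        month = datetime.strptime(date, '%Y-%m-%d %H:%M:%S').month
        max_temps[month] = max(max_temps[month], float(temp))
    return max_temps

result = monthly_max(io.StringIO(SAMPLE))
for month in range(1, 13):
    if result[month] != float('-inf'):  # only print months that had data
        print('{}: {:.1f}'.format(month_name[month], result[month]))
```

Months with no rows keep the sentinel value, which makes missing data easy to spot before printing.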
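Answer 1 (score: 1)
You have to find not one maximum but all twelve. You can start with a list of month names and look up the maximum for each month in that list. In the CSV file, the month is at character positions 5 to 6 (counting from zero) of the first element.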
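To make the character positions concrete, here is a small sketch of the slicing on a single timestamp. The timestamp value and the variable names are made up for illustration. Python slices are zero-based and exclude the end index, so `[5:7]` covers positions 5 and 6.

```python
# A made-up timestamp in the same format as the first column of the file.
stamp = "2012-03-01 00:00:00"

month_digits = stamp[5:7]            # characters at positions 5 and 6
month_index = int(month_digits) - 1  # zero-based index into a list of month names

month = ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"]

print(month_digits, month[month_index])
```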
With this data format...
Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
… to be continued
...you can find the maxima with this program:
month = ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"]
maxt = {}
with open("weather_2012.csv", "r") as temp:
    for line in temp:
        try:  # is there valid data in line?
            m0, p0, *junk = line.split(",")
            p = float(p0)
            m = month[int(m0[5:7]) - 1]
        except (ValueError, IndexError):  # header or malformed line: skip it
            continue
        try:  # do we already have data for this month?
            maxt[m] = max(p, maxt[m])
        except KeyError:  # first data of this month
            maxt[m] = p
print("Maxima:")
for m in month:
    print("%s: %g" % (m, maxt[m]))
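Answer 2 (score: 0)
First you have to group the values by the month in the first column; then you can find the maximum temperature for each month.
I hope the following code helps: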
import csv

months = {
    "01": "January",
    "02": "February",
    "03": "March",
    "04": "April",
    "05": "May",
    "06": "June",
    "07": "July",
    "08": "August",
    "09": "September",
    "10": "October",
    "11": "November",
    "12": "December",
}

results = {}
with open("weather_2012.csv", "r", newline="") as csvfile:
    weather_file = csv.DictReader(csvfile, delimiter=",", quotechar='"')
    for row in weather_file:
        # get month, e.g. "2012-01-01 00:00:00" -> "01"
        month = row["Date/Time"].split(" ")[0].split("-")[1]
        if month not in results:
            results[month] = {
                "max": float(row["Temp (C)"])
            }
            continue
        if float(row["Temp (C)"]) > results[month]["max"]:
            results[month]["max"] = float(row["Temp (C)"])

# ordering and showing
print("Max temp by month:")
for month in sorted(results):
    # do some stuff about month, in this case only show it
    print("%s: %.1f" % (months[month], results[month]["max"]))
Output:
Max temp by month:
January: 5.3
February: 6.1
March: 25.5
April: 27.8
May: 31.2
June: 33.0
July: 33.0
August: 32.8
September: 28.4
October: 21.1
November: 17.5
December: 11.9
Answer 3 (score: 0)
Another solution could look like this:
# -*- coding: utf-8 -*-
import csv
import datetime
import itertools
import collections

rows = []
with open('weather_2012.csv', 'r', newline='') as fd:
    reader = csv.DictReader(fd, delimiter=',')
    for row in reader:
        # add a 'YYYYMM' key to group on, e.g. '201201'
        row['yearmonth'] = datetime.datetime.strptime(
            row['Date/Time'], '%Y-%m-%d %H:%M:%S').strftime('%Y%m')
        rows.append(row)

# sort them
rows.sort(key=lambda r: r['yearmonth'])
ans = collections.OrderedDict()
for yearmonth, values in itertools.groupby(rows, lambda r: r['yearmonth']):
    ans[yearmonth] = max(float(r['Temp (C)']) for r in values)
print(ans)
This solution first sorts the data by a year-month string and then uses the itertools.groupby function.
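One detail worth spelling out: `itertools.groupby` only merges adjacent items with equal keys, which is why the sort has to come first. Here is a minimal sketch with made-up (yearmonth, temperature) pairs that are not taken from the file; the helper name `max_by_key` is also just for illustration.

```python
import itertools

# Made-up (yearmonth, temperature) pairs, deliberately out of order.
readings = [("201202", 6.1), ("201201", -1.8), ("201202", -7.0), ("201201", 5.3)]

def max_by_key(pairs):
    """Group consecutive pairs by their first element and take the max of the second."""
    return {key: max(temp for _, temp in group)
            for key, group in itertools.groupby(pairs, key=lambda p: p[0])}

# Without sorting, equal keys are not adjacent, so each key forms several
# groups and later groups overwrite earlier ones in the dict.
unsorted_result = max_by_key(readings)

# Sorting by the key first gives exactly one group per month.
sorted_result = max_by_key(sorted(readings, key=lambda p: p[0]))

print(unsorted_result)
print(sorted_result)
```

If sorting the whole file is a concern, a plain dict keyed by month (as in the other answers) gets the same result in a single pass.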