Finding the maximum temperature for each month in a CSV file?

Time: 2016-12-17 17:21:55

Tags: python python-3.x csv

I need some help. I have a large CSV file (8785+ rows).

What I basically need is the maximum temperature for each month. For example (output):

Month Max Temperature

January 5.3
February 6.1
March 25.5
...

I wrote this:

temp = open("weather_2012.csv","r")
total = 0
maxt = 0.0

for line in temp:
    try:
        p = float(line.split(",")[1])
        total += 1
        maxt = max(maxt,p)
    except:
        pass

print("Maximum:",maxt)

But it only gives a single maximum temperature (for the whole file):

Maximum: 33.0

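4 answers:

Answer 0 (score: 3)

I think this is a good approach because it avoids hard-coding many (if not most) of the values the code needs (so it works for any year and uses locale-specific month names):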

from calendar import month_name
import csv
from datetime import datetime
import sys

filename = 'weather_2012.csv'
max_temps = [-sys.maxsize] * 13  # has extra [0] entry

with open(filename, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    next(reader)  # skip header row
    for date, high_temp, *_ in reader:
        month = datetime.strptime(date, '%Y-%m-%d %H:%M:%S').month
        max_temps[month] = max(max_temps[month], float(high_temp))

print('Monthly Max Temperatures\n')
longest = max(len(month) for month in month_name)  # length of longest month name
for month, temp in enumerate(max_temps[1:], 1):
    print('{:>{width}}: {:5.1f}'.format(month_name[month], temp, width=longest))

Output:

Monthly Max Temperatures

  January:   5.3
 February:   6.1
    March:  25.5
    April:  27.8
      May:  31.2
     June:  33.0
     July:  33.0
   August:  32.8
September:  28.4
  October:  21.1
 November:  17.5
 December:  11.9

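Answer 1 (score: 1)

You have to find not one maximum, but all twelve. You can start with a list of month names and keep track of the maximum for each month in that list. In the CSV file, the month is at character positions 5 to 6 (counting from zero) of the first element.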
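
To make the character positions concrete, here is a minimal sketch that slices the month out of one row. The row is taken from the sample data shown in this answer; the variable names are only illustrative.

```python
# One sample row in the format of weather_2012.csv (from the data in this answer).
line = '2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog'

timestamp = line.split(",")[0]     # '2012-01-01 00:00:00'
month_digits = timestamp[5:7]      # zero-based positions 5 and 6
month_number = int(month_digits)   # 1 means January

print(month_digits, month_number)
```

Subtracting 1 from `month_number` gives the index into a zero-based list of month names.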

With this data format...

Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
… to be continued

...you can find the maxima with this program:

month=["January","February","March","April","May","June","July",
       "August","September","October","November","December"]
maxt = {}
with open("weather_2012.csv","r") as temp:
    for line in temp:
        try: # is there valid data in line?
            m0, p0, *junk = line.split(",")
            p = float(p0)
            m = month[int(m0[5:7])-1]
        except (ValueError, IndexError): # skip header or malformed line
            continue
        # keep the larger of the new value and any maximum stored so far
        maxt[m] = max(p, maxt.get(m, p))

print("Maxima:")        
for m in month:
    print("%s: %g"%(m,maxt[m]))

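Answer 2 (score: 0)

First, you have to group the values by the month found in the first column; then you can find the maximum temperature for each month.

Hope the following code helps: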

import csv
months= {
    "01": "January",
    "02": "February",
    "03": "March",
    "04": "April",
    "05": "May",
    "06": "June",
    "07": "July",
    "08": "August",
    "09": "September",
    "10": "October",
    "11": "November",
    "12": "December"
}

weather_file = csv.DictReader(open("weather_2012.csv", 'r'), delimiter=',', quotechar='"')

results = {}

for row in weather_file:
    # get month
    month = row["Date/Time"].split(" ")[0].split("-")[1]
    if not (month in results):
        results[month] = {
            "max": float(row["Temp (C)"])
        }
        continue

    if float(row["Temp (C)"]) > results[month]["max"]:
        results[month]["max"] = float(row["Temp (C)"])

# ordering and showing (print() calls, since the question targets Python 3)
print("Max temp by month:")
for month in sorted(results):
    # do something with each month; in this case we only print it
    print("%s: %.2f" % (months[month], results[month]["max"]))

Output:

Max temp by month:
January: 5.3
February: 6.1
March: 25.5
April: 27.8
May: 31.2
June: 33.0
July: 33.0
August: 32.8
September: 28.4
October: 21.1
November: 17.5
December: 11.9

Answer 3 (score: 0)

Another solution could look like this:

#-*- coding: utf-8 -*-
import csv
import datetime
import itertools
import collections

fd = open('weather_2012.csv', 'r', newline='')  # text mode: the csv module needs str rows in Python 3
reader = csv.DictReader(fd, delimiter=',')
rows = []
for row in reader:
    row['yearmonth'] = datetime.datetime.strptime(row['Date/Time'],  '%Y-%m-%d %H:%M:%S').strftime('%Y%m')
    rows.append(row)
fd.close()
# sort them
rows.sort(key=lambda r: r['yearmonth'])
ans = collections.OrderedDict()
for yearmonth, values in itertools.groupby(rows, lambda r: r['yearmonth']):
    ans[yearmonth] = max([float(r['Temp (C)']) for r in values])

print(ans)

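This solution first sorts the rows by their year-month string and then groups them with itertools.groupby (a standard-library function, not a built-in). The sort is required because groupby only merges consecutive rows that share the same key.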
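
As a rough illustration of why the sort matters, the sketch below runs the same grouping on unsorted and sorted input. The `readings` list and the `monthly_max` helper are hypothetical and only stand in for the rows read from the CSV file.

```python
import itertools

# Hypothetical (yearmonth, temperature) pairs, deliberately out of order.
readings = [("201202", 6.1), ("201201", 5.3), ("201202", 1.0), ("201201", -1.8)]

def monthly_max(pairs):
    """Return {yearmonth: max temperature}, grouping consecutive equal keys."""
    return {
        key: max(temp for _, temp in group)
        for key, group in itertools.groupby(pairs, key=lambda pair: pair[0])
    }

# Without sorting, each month is split into several groups and later
# groups overwrite earlier ones, so the result is wrong.
unsorted_result = monthly_max(readings)

# After sorting by the key, each month forms exactly one group.
sorted_result = monthly_max(sorted(readings, key=lambda pair: pair[0]))

print(unsorted_result)
print(sorted_result)
```

Only the sorted call yields one correct maximum per month, which is why the answer sorts `rows` before calling groupby.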