Using a csv file, find the mean temperature

Asked: 2016-12-18 18:51:52

Tags: python python-3.x

I need some help. I have a large csv file (8785+ lines).

Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
2012-01-01 05:00:00,-1.4,-3.3,87,9,6.4,101.27,Fog
2012-01-01 06:00:00,-1.5,-3.1,89,7,6.4,101.29,Fog
2012-01-01 07:00:00,-1.4,-3.6,85,7,8.0,101.26,Fog
2012-01-01 08:00:00,-1.4,-3.6,85,9,8.0,101.23,Fog
2012-01-01 09:00:00,-1.3,-3.1,88,15,4.0,101.2,Fog
2012-01-01 10:00:00,-1.0,-2.3,91,9,1.2,101.15,Fog
2012-01-01 11:00:00,-0.5,-2.1,89,7,4.0,100.98,Fog
2012-01-01 12:00:00,-0.2,-2.0,88,9,4.8,100.79,Fog
2012-01-01 13:00:00,0.2,-1.7,87,13,4.8,100.58,Fog
2012-01-01 14:00:00,0.8,-1.1,87,20,4.8,100.31,Fog
2012-01-01 15:00:00,1.8,-0.4,85,22,6.4,100.07,Fog
2012-01-01 16:00:00,2.6,-0.2,82,13,12.9,99.93,Mostly Cloudy
2012-01-01 17:00:00,3.0,0.0,81,13,16.1,99.81,Cloudy
2012-01-01 18:00:00,3.8,1.0,82,15,12.9,99.74,Rain

So what I basically need is the mean temperature for each weather condition. For example (output):

Weather Mean Temperature
Clear 6.825716
Cloudy 7.970544
Drizzle 7.353659
Drizzle,Fog 8.067500
Drizzle,Ice Pellets,Fog 0.400000
Drizzle,Snow 1.050000
Drizzle,Snow,Fog 0.693333
Fog 4.303333
Freezing Drizzle -5.657143
Freezing Drizzle,Fog -2.533333
Freezing Drizzle,Haze -5.433333
........

What I have:

import csv
weather_file = csv.DictReader(open("weather_2012.csv", 'r'), 
                              delimiter=',', quotechar='"')

results = {}

for row in weather_file:

    weather = row["Weather"].split(" "" ")
    if not (weather in results):
        results[weather] = {
            "max": float(row["Temp (C)"])
        }
        continue

    if float(row["Temp (C)"]) > results[weather]["max"]:
        results[weather]["max"] = float(row["Temp (C)"])

y=[]
print("Weather   Mean Temperature")
for month in sorted(results, key=lambda results: results):
    y.append(results[month]["max"])

    print("%s %.1f" % (weather[month], results[month]["max"]))

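I have to find the mean of the temperatures for each weather condition...

Each weather condition has a set of temperatures. I have to group (sort) all the temperatures by weather condition. For example:

  The "Cloudy" weather condition occurs 300+ times. I have to find the mean of its temperatures and report it for "Cloudy" weather.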

2 Answers:

Answer 0: (score: 2)

Here is one way to do it:

#!/usr/bin/env python3
import csv
from pprint import pprint

filename = 'weather_2012.csv'
condition_mean_temps = {}

# Initially associate a list of temperature values with each condition.
with open(filename, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    next(reader)  # skip header row
    # Only need second and last value from each row of csv data file.
    for _, temperature, *_, condition in reader:
        condition_mean_temps.setdefault(condition, []).append(float(temperature))

# (Re)associate the mean of the associated list of values with each condition.
condition_mean_temps = {condition: round(sum(temperatures)/len(temperatures), 2)
                            for condition, temperatures
                                in condition_mean_temps.items()}

pprint(condition_mean_temps)

Output:

{'Clear': 6.83,
 'Cloudy': 7.97,
 'Drizzle': 7.35,
 'Drizzle,Fog': 8.07,
 'Drizzle,Ice Pellets,Fog': 0.4,
 'Drizzle,Snow': 1.05,
 'Drizzle,Snow,Fog': 0.69,
 'Fog': 4.3,
 'Freezing Drizzle': -5.66,
 'Freezing Drizzle,Fog': -2.53,
 'Freezing Drizzle,Haze': -5.43,
 'Freezing Drizzle,Snow': -5.11,
 'Freezing Fog': -7.58,
 'Freezing Rain': -3.89,
 'Freezing Rain,Fog': -2.22,
 'Freezing Rain,Haze': -4.9,
 'Freezing Rain,Ice Pellets,Fog': -2.6,
 'Freezing Rain,Snow Grains': -5.0,
 'Haze': -0.2,
 'Mainly Clear': 12.56,
 'Moderate Rain,Fog': 1.7,
 'Moderate Snow': -5.53,
 'Moderate Snow,Blowing Snow': -5.45,
 'Mostly Cloudy': 10.57,
 'Rain': 9.79,
 'Rain Showers': 13.72,
 'Rain Showers,Fog': 12.8,
 'Rain Showers,Snow Showers': 2.15,
 'Rain,Fog': 8.27,
 'Rain,Haze': 4.63,
 'Rain,Ice Pellets': 0.6,
 'Rain,Snow': 1.06,
 'Rain,Snow Grains': 1.9,
 'Rain,Snow,Fog': 0.8,
 'Rain,Snow,Ice Pellets': 1.1,
 'Snow': -4.52,
 'Snow Pellets': 0.7,
 'Snow Showers': -3.51,
 'Snow Showers,Fog': -10.68,
 'Snow,Blowing Snow': -5.41,
 'Snow,Fog': -5.08,
 'Snow,Haze': -4.02,
 'Snow,Ice Pellets': -1.88,
 'Thunderstorms': 24.15,
 'Thunderstorms,Heavy Rain Showers': 10.9,
 'Thunderstorms,Moderate Rain Showers,Fog': 19.6,
 'Thunderstorms,Rain': 20.43,
 'Thunderstorms,Rain Showers': 20.04,
 'Thunderstorms,Rain Showers,Fog': 21.6,
 'Thunderstorms,Rain,Fog': 20.6}
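
The same grouping can also be written with csv.DictReader, which is closer to the code in the question. The following is only a sketch: the SAMPLE string is a stand-in built from a few rows of the question's data, and the helper name mean_temp_by_weather is made up for illustration. A real script would pass the opened weather_2012.csv file instead.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# A few rows copied from the question's sample; a real script would open
# "weather_2012.csv" instead of using this in-memory stand-in.
SAMPLE = '''Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 17:00:00,3.0,0.0,81,13,16.1,99.81,Cloudy
'''


def mean_temp_by_weather(csvfile):
    """Return {weather condition: mean of "Temp (C)"} for an open csv file."""
    temps = defaultdict(list)
    for row in csv.DictReader(csvfile):
        # The csv module already un-quotes "Freezing Drizzle,Fog", so the
        # Weather value is used whole as the key; no split() is needed.
        temps[row["Weather"]].append(float(row["Temp (C)"]))
    return {weather: mean(values) for weather, values in temps.items()}


means = mean_temp_by_weather(io.StringIO(SAMPLE))
print("Weather Mean Temperature")
for weather in sorted(means):
    print("%s %f" % (weather, means[weather]))
```

Keeping a list of temperatures per condition, rather than a single running "max" value as in the question, is what makes it possible to compute the mean at the end.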

Answer 1: (score: 2)

Here is one way to do this using Pandas:

import pandas as pd

d = pd.read_csv("test.csv")
means = d.groupby('Weather')['Temp (C)'].mean()
print(means)

I'm assuming the data is stored in the test.csv file.

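pandas is a data analysis library built around three basic concepts: Series, DataFrame, and Panel (Panel has since been removed from pandas). Here we create a DataFrame. You can think of it as a row-and-column representation of the data, which is exactly what a csv file is. That makes working with csv files in pandas very easy.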
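
To make the row-and-column idea concrete, here is a small self-contained sketch that needs no file on disk. It assumes pandas is installed; the SAMPLE string is a stand-in built from a few rows of the question's data, read through io.StringIO in place of a real csv file.

```python
import io

import pandas as pd

# Stand-in for the csv file: a few rows copied from the question's sample.
SAMPLE = '''Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 17:00:00,3.0,0.0,81,13,16.1,99.81,Cloudy
'''

# Each csv line becomes a row and each header field becomes a column.
d = pd.read_csv(io.StringIO(SAMPLE))
print(d.shape)

# groupby splits the rows by Weather; mean() then averages "Temp (C)"
# within each group. Group keys are sorted by default.
means = d.groupby("Weather")["Temp (C)"].mean()
print(means)
```

The result is a Series indexed by weather condition, so a single value can be looked up with, for example, means["Fog"].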

To learn more, check this out - http://pandas.pydata.org/

This particular solution can be found here - http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html