I need some help. I have a large CSV file (8,785+ rows).
Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 01:00:00,-1.8,-3.7,87,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
2012-01-01 05:00:00,-1.4,-3.3,87,9,6.4,101.27,Fog
2012-01-01 06:00:00,-1.5,-3.1,89,7,6.4,101.29,Fog
2012-01-01 07:00:00,-1.4,-3.6,85,7,8.0,101.26,Fog
2012-01-01 08:00:00,-1.4,-3.6,85,9,8.0,101.23,Fog
2012-01-01 09:00:00,-1.3,-3.1,88,15,4.0,101.2,Fog
2012-01-01 10:00:00,-1.0,-2.3,91,9,1.2,101.15,Fog
2012-01-01 11:00:00,-0.5,-2.1,89,7,4.0,100.98,Fog
2012-01-01 12:00:00,-0.2,-2.0,88,9,4.8,100.79,Fog
2012-01-01 13:00:00,0.2,-1.7,87,13,4.8,100.58,Fog
2012-01-01 14:00:00,0.8,-1.1,87,20,4.8,100.31,Fog
2012-01-01 15:00:00,1.8,-0.4,85,22,6.4,100.07,Fog
2012-01-01 16:00:00,2.6,-0.2,82,13,12.9,99.93,Mostly Cloudy
2012-01-01 17:00:00,3.0,0.0,81,13,16.1,99.81,Cloudy
2012-01-01 18:00:00,3.8,1.0,82,15,12.9,99.74,Rain
What I basically need is the mean temperature for each weather condition. For example (output):
Weather Mean Temperature
Clear 6.825716
Cloudy 7.970544
Drizzle 7.353659
Drizzle,Fog 8.067500
Drizzle,Ice Pellets,Fog 0.400000
Drizzle,Snow 1.050000
Drizzle,Snow,Fog 0.693333
Fog 4.303333
Freezing Drizzle -5.657143
Freezing Drizzle,Fog -2.533333
Freezing Drizzle,Haze -5.433333
........
What I have:
import csv
weather_file = csv.DictReader(open("weather_2012.csv", 'r'),
                              delimiter=',', quotechar='"')
results = {}
for row in weather_file:
    weather = row["Weather"].split(" "" ")
    if not (weather in results):
        results[weather] = {
            "max": float(row["Temp (C)"])
        }
        continue
    if float(row["Temp (C)"]) > results[weather]["max"]:
        results[weather]["max"] = float(row["Temp (C)"])
y=[]
print("Weather Mean Temperature")
for month in sorted(results, key=lambda results: results):
    y.append(results[month]["max"])
    print("%s %.1f" % (weather[month], results[month]["max"]))
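I have to find the mean of the temperatures for each weather condition.
Each weather condition occurs with a range of temperatures. I have to group (sort) all temperatures by weather condition. For example:
the "Cloudy" weather condition appears in more than 300 rows. I have to find the mean temperature of those rows and report it for "Cloudy".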
Answer 0 (score: 2)
Here is one way to do it:
#!/usr/bin/env python3
import csv
from pprint import pprint
filename = 'weather_2012.csv'
condition_mean_temps = {}
# Initially associate a list of temperature values with each condition.
with open(filename, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile); next(reader)  # skip header row
    # Only need second and last value from each row of csv data file.
    for _, temperature, *_, condition in reader:
        condition_mean_temps.setdefault(condition, []).append(float(temperature))
# (Re)associate the mean of the associated list of values with each condition.
condition_mean_temps = {condition: round(sum(temperatures)/len(temperatures), 2)
                        for condition, temperatures
                        in condition_mean_temps.items()}
pprint(condition_mean_temps)
Output:
{'Clear': 6.83,
'Cloudy': 7.97,
'Drizzle': 7.35,
'Drizzle,Fog': 8.07,
'Drizzle,Ice Pellets,Fog': 0.4,
'Drizzle,Snow': 1.05,
'Drizzle,Snow,Fog': 0.69,
'Fog': 4.3,
'Freezing Drizzle': -5.66,
'Freezing Drizzle,Fog': -2.53,
'Freezing Drizzle,Haze': -5.43,
'Freezing Drizzle,Snow': -5.11,
'Freezing Fog': -7.58,
'Freezing Rain': -3.89,
'Freezing Rain,Fog': -2.22,
'Freezing Rain,Haze': -4.9,
'Freezing Rain,Ice Pellets,Fog': -2.6,
'Freezing Rain,Snow Grains': -5.0,
'Haze': -0.2,
'Mainly Clear': 12.56,
'Moderate Rain,Fog': 1.7,
'Moderate Snow': -5.53,
'Moderate Snow,Blowing Snow': -5.45,
'Mostly Cloudy': 10.57,
'Rain': 9.79,
'Rain Showers': 13.72,
'Rain Showers,Fog': 12.8,
'Rain Showers,Snow Showers': 2.15,
'Rain,Fog': 8.27,
'Rain,Haze': 4.63,
'Rain,Ice Pellets': 0.6,
'Rain,Snow': 1.06,
'Rain,Snow Grains': 1.9,
'Rain,Snow,Fog': 0.8,
'Rain,Snow,Ice Pellets': 1.1,
'Snow': -4.52,
'Snow Pellets': 0.7,
'Snow Showers': -3.51,
'Snow Showers,Fog': -10.68,
'Snow,Blowing Snow': -5.41,
'Snow,Fog': -5.08,
'Snow,Haze': -4.02,
'Snow,Ice Pellets': -1.88,
'Thunderstorms': 24.15,
'Thunderstorms,Heavy Rain Showers': 10.9,
'Thunderstorms,Moderate Rain Showers,Fog': 19.6,
'Thunderstorms,Rain': 20.43,
'Thunderstorms,Rain Showers': 20.04,
'Thunderstorms,Rain Showers,Fog': 21.6,
'Thunderstorms,Rain,Fog': 20.6}
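For comparison, the same grouping can be sketched with csv.DictReader, which is what the question's code started from. This is only a minimal sketch, not the method used above: the helper name mean_temp_by_condition and the inline SAMPLE rows (copied from the question) are assumptions made for illustration, while the column names come from the file's header.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# A few rows copied from the question's sample, standing in for the real file.
SAMPLE = '''Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
2012-01-01 17:00:00,3.0,0.0,81,13,16.1,99.81,Cloudy
'''


def mean_temp_by_condition(csvfile):
    """Return {weather condition: mean temperature} for an open CSV file."""
    temps = defaultdict(list)
    for row in csv.DictReader(csvfile):
        # Use the whole "Weather" string as the key. Splitting it would give
        # a list, and a list cannot be used as a dictionary key.
        temps[row["Weather"]].append(float(row["Temp (C)"]))
    return {condition: mean(values) for condition, values in temps.items()}


means = mean_temp_by_condition(io.StringIO(SAMPLE))
print("Weather Mean Temperature")
for condition in sorted(means):
    print("%s %.2f" % (condition, means[condition]))
```

With the real data, pass open('weather_2012.csv', newline='') to the helper instead of the io.StringIO object.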
Answer 1 (score: 2)
Here is one way to do this using Pandas:
import pandas as pd
d = pd.read_csv("test.csv")
means = d.groupby('Weather')['Temp (C)'].mean()
print(means)
I am assuming the data is stored in the file test.csv.
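pandas is a data analysis library built around three basic concepts: Series, DataFrame, and Panel (Panel has been removed in later pandas versions). Here we create a DataFrame. You can think of it as a row-and-column representation of the data, which is exactly what a CSV file is. That makes working with CSV files in pandas very easy.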
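To make the row-and-column idea concrete, here is a small sketch that loads a few of the question's rows into a DataFrame and groups them by weather condition. The inline SAMPLE text and the variable names are assumptions for illustration only; with the real data you would pass the file path to pd.read_csv instead of an io.StringIO object.

```python
import io

import pandas as pd

# A few rows copied from the question's sample, standing in for the real file.
SAMPLE = '''Date/Time,Temp (C),Dew Point Temp (C),Rel Hum (%),Wind Spd (km/h),Visibility (km),Stn Press (kPa),Weather
2012-01-01 00:00:00,-1.8,-3.9,86,4,8.0,101.24,Fog
2012-01-01 02:00:00,-1.8,-3.4,89,7,4.0,101.26,"Freezing Drizzle,Fog"
2012-01-01 03:00:00,-1.5,-3.2,88,6,4.0,101.27,"Freezing Drizzle,Fog"
2012-01-01 04:00:00,-1.5,-3.3,88,7,4.8,101.23,Fog
2012-01-01 17:00:00,3.0,0.0,81,13,16.1,99.81,Cloudy
'''

# Each CSV line becomes a row and each header field becomes a column.
df = pd.read_csv(io.StringIO(SAMPLE))
print(df.shape)  # (number of rows, number of columns)

# Group the rows by the "Weather" column, then average "Temp (C)" per group.
# The result is a Series indexed by weather condition.
means = df.groupby("Weather")["Temp (C)"].mean()
print(means)
```

The quoted field "Freezing Drizzle,Fog" is read as a single value, so combined conditions stay in their own group rather than being split at the comma.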
To learn more, see http://pandas.pydata.org/
The documentation for this particular solution is here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.groupby.html