Daily data for the same month across several years

Time: 2019-10-16 02:28:18

Tags: python-3.x pandas time-series

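I have data for the same month over a period of several years, and I am trying to plot the mean for each day of that month, but I don't know how to do it. This is what the dataframe looks like

The main code to get the dataframe: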

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs
import matplotlib.pyplot as plt
from datetime import date, timedelta
from datetime import datetime

inicio = date(1973, 1, 1)
#inicio = date(2019, 2, 15)
#final = date(2000, 10, 10)
final = date(1974, 3, 1)
delta = timedelta(days=1)
años=[]
links=[]
while inicio <= final:
    fechas=inicio.strftime("%Y-%m-%d")
    #años.append(datetime.strptime(fechas, '%Y-%m-%d').date())
    años.append(fechas)
    url='http://weather.uwyo.edu/cgi-bin/sounding?region=samer&TYPE=TEXT%3ALIST&YEAR={}&MONTH={}&FROM={}12&TO={}12&STNM=80222'.format(fechas[0:4],fechas[5:7],fechas[8:10],fechas[8:10])
    links.append(url)
    inicio += delta


d = dict(zip(años, links))
df1=pd.DataFrame(list(d.items()), columns=['Fecha', 'url'])
df1.set_index('Fecha', inplace=True)

# Pick the rows for each month from the 'YYYY-MM-DD' index strings
# (avoids DataFrame.append, which was removed in pandas 2.0)
Enero = df1[df1.index.str[5:7] == '01']
Febrero = df1[df1.index.str[5:7] == '02']

labels = ['PRES', 'HGHT', 'TEMP', 'DWPT', 'RELH', 'MIXR', 'DRCT', 'SKNT', 'THTA', 'THTE', 'THTV']

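def reques(url):
    # Download one sounding page and parse the fixed-width table in its <pre> block.
    # Returns None when the page has no <pre> block (no data for that date).
    try:
        results = []
        peticion = requests.get(url)
        soup = bs(peticion.content, 'lxml')
        pre = soup.select_one('pre').text
        # Skip the first 4 lines and the last line of the block
        for line in pre.split('\n')[4:-1]:
            if '--' not in line:
                # Each column is 7 characters wide
                row = [line[i:i+7].strip() for i in range(0, len(line), 7)]
                results.append(row)

        return pd.DataFrame.from_records(results, columns=labels)

    except AttributeError:
        return None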

startTime = datetime.now()
frames = []
sin_datos = []
# Febrero is indexed by date string, so items() yields (date, url) pairs
for fecha, url in Febrero['url'].items():
    x = reques(url)
    if x is None:
        # No sounding data for this date
        sin_datos.append(fecha)
        print(fecha)
        continue
    # Label every row of this sounding with its date
    x.index = [fecha] * len(x)
    frames.append(x)

# Concatenate once instead of appending inside the loop
SuperDF = pd.concat(frames) if frames else pd.DataFrame(columns=labels)
SuperDF.index = pd.to_datetime(SuperDF.index)
SuperDF = SuperDF.apply(pd.to_numeric)
SuperDF

I have been trying to do it like this:

import seaborn as sns


SuperDF = SuperDF[(SuperDF['TEMP']==0)]


ax = SuperDF.loc['02', 'RELH'].plot(marker='o', linestyle='-')
ax.set_ylabel('RELH');

But I get this error:

KeyError: '02'
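
For context, a likely reason for this error: string lookups with .loc on a DatetimeIndex are matched against full or partial dates that include the year, such as '1973' or '1973-02'. A bare '02' is not read as "every February". The sketch below uses synthetic data (the `demo` frame and its values are assumptions, not the real soundings) to show the lookups that do work and a boolean mask that selects one month across all years.

```python
import pandas as pd

# Synthetic stand-in for SuperDF (assumption): one RELH value per day
# for February 1973 and February 1974, indexed by date.
idx = pd.date_range("1973-02-01", "1973-02-28").append(
    pd.date_range("1974-02-01", "1974-02-28"))
demo = pd.DataFrame({"RELH": range(len(idx))}, index=idx)

# Partial date strings that include the year are accepted by .loc.
one_year = demo.loc["1973", "RELH"]        # every row in 1973
one_month = demo.loc["1974-02", "RELH"]    # every row in February 1974

# One month across every year: build a boolean mask from the index instead.
all_febs = demo.loc[demo.index.month == 2, "RELH"]
```
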

It works when I pass a year, but I need the daily mean for a single month. Any help would be appreciated.
This is what I need
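
One possible way to get a per-day mean for a month, sketched on synthetic data: filter the rows by `index.month`, then group by `index.day` and average. The `demo` frame and its RELH values are assumptions made for illustration; with the real data, the same two steps would be applied to SuperDF once its index is a DatetimeIndex.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for SuperDF (assumption): one RELH reading per day
# for February 1973 and February 1974, indexed by date.
idx = pd.date_range("1973-02-01", "1973-02-28").append(
    pd.date_range("1974-02-01", "1974-02-28"))
demo = pd.DataFrame({"RELH": np.arange(len(idx), dtype=float)}, index=idx)

# Keep only the February rows, whatever the year.
feb = demo[demo.index.month == 2]

# Average all readings that share the same day of the month.
daily_mean = feb.groupby(feb.index.day)["RELH"].mean()
daily_mean.index.name = "day"
```

Calling `daily_mean.plot(marker='o', linestyle='-')` would then draw one point per day of the month. If a day holds several rows (for example one per pressure level), they are all averaged together, so any filtering by level would need to happen before the groupby.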

0 Answers:

No answers