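I am trying to convert 10 years (1991-2000) of daily temperature data into monthly values with pandas, using Python 2.7. I got the data from this web page ("http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/201j_EN.htm"), but I am running into trouble. The data looks like this: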
` datum d_ta d_tx d_tn d_rs d_rf d_ss
---------- ----- ----- ----- ----- ---- -----
1991-01-01 3.0 5.4 1.5 0.2 1 0.0
1991-01-02 4.0 7.2 1.9 0.0 1 6.8
1991-01-03 6.0 8.8 3.6 0.0 1 2.5
1991-01-04 3.7 7.6 2.3 . 2.9
1991-01-05 4.9 7.2 1.5 . 0.0
1991-01-06 2.7 6.2 0.5 . 0.9
1991-01-07 4.0 8.4 1.9 . 3.2
1991-01-08 6.7 8.9 4.6 0.0 0 0.0
1991-01-09 4.1 8.0 3.0 0.3 0 0.0
1991-01-10 4.2 8.1 2.4 0.0 0 0.2
1991-01-11 4.7 6.9 3.6 . 0.7
1991-01-12 7.0 9.8 3.2 . 0.1
1991-01-13 6.3 8.2 4.6 . 0.0
1991-01-14 3.7 6.8 2.2 . 4.7
1991-01-15 0.7 3.4 -1.0 . 7.6
1991-01-16 -1.4 1.4 -3.0 . 7.5
1991-01-17 -2.5 2.1 -5.0 . 8.1
1991-01-18 -1.8 4.0 -5.1 . 7.0
1991-01-19 -3.0 0.1 -4.0 . 5.8
1991-01-20 -2.8 0.5 -5.2 . 5.6
1991-01-21 -5.0 -1.7 -7.8 . 0.0
1991-01-22 -3.3 -1.8 -4.2 . 0.0
1991-01-23 -1.7 0.4 -2.5 . 0.0
1991-01-24 0.0 3.2 -1.6 . 2.2
1991-01-25 1.1 5.1 -0.9 . 6.4
1991-01-26 0.6 4.5 -0.5 . 7.1
1991-01-27 -1.5 2.2 -4.0 . 0.0
1991-01-28 1.3 5.6 -0.8 . 3.8
1991-01-29 0.7 2.6 -0.4 . 1.1
1991-01-30 0.3 4.0 -1.2 . 7.3
1991-01-31 -5.0 -0.2 -7.4 . 8.0
1991-02-01 -8.1 -3.7 -11.7 . 7.6
1991-02-02 -7.0 -2.0 -10.2 . 7.4
1991-02-03 -5.3 0.8 -9.9 . 7.8
1991-02-04 -5.1 -2.3 -7.7 0.1 4 3.7
1991-02-05 -7.5 -4.4 -8.3 . 2.6
1991-02-06 -7.1 -2.2 -11.0 2.0 4 4.9
1991-02-07 -1.8 0.0 -2.7 2.7 4 0.0
1991-02-08 -1.8 0.4 -3.6 21.8 4 0.0
1991-02-09 0.8 2.0 -0.2 1.3 1 0.0
1991-02-10 1.6 3.4 -0.2 3.4 1 0.0
1991-02-11 0.7 2.5 -0.5 1.1 4 0.0
1991-02-12 -0.5 1.2 -1.0 4.7 4 0.0
1991-02-13 -2.0 -0.8 -2.6 0.0 4 0.0
1991-02-14 -1.8 1.4 -3.5 0.1 4 6.3
1991-02-15 -4.2 -0.8 -6.4 . 8.4
1991-02-16 -5.6 -2.4 -9.5 0.1 4 1.5
1991-02-17 -1.3 1.9 -3.8 . 8.3
1991-02-18 -1.3 4.5 -5.5 . 8.5
1991-02-19 -1.5 3.6 -4.7 . 5.8
1991-02-20 -1.4 4.7 -5.4 . 7.3
1991-02-21 1.0 6.1 -2.1 . 6.9
1991-02-22 4.1 10.1 0.5 . 3.2
1991-02-23 5.1 9.7 2.9 . 7.5
1991-02-24 6.0 8.6 5.5 0.0 1 1.8
1991-02-25 3.6 9.2 0.6 . 8.1
1991-02-26 3.9 9.3 1.2 . 2.9
1991-02-27 3.1 6.5 0.3 . 8.8
1991-02-28 1.4 5.3 -2.4 . 4.3
1991-03-01 1.7 3.5 -0.2 . 0.0
1991-03-02 2.4 3.3 1.7 0.8 4 0.0
1991-03-03 3.1 3.8 1.7 . 0.0
1991-03-04 4.3 6.2 2.7 . 1.5
1991-03-05 3.0 5.7 0.6 . 1.2
.........`
Can someone help me convert it to monthly values? Thanks!
Answer 0 (score: 4)
Copy the table into memory, starting from the numbers:
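import io
import itertools

import bs4
import pandas
import requests

html = requests.get("http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/201j_EN.htm").text
soup = bs4.BeautifulSoup(html, "html.parser")
# the manual way (read the table from the clipboard):
# data = pandas.read_clipboard(names=["datum", "d_ta", "d_tx", "d_tn", "d_rs", "d_rf", "d_ss"], index_col='datum', parse_dates=['datum'])
# the automatic way: join the text of every <pre> block after the first three
table_html = '\n'.join(itertools.islice((pre.text for pre in soup.find_all("pre")), 3, None))
# note: rows where d_rs is "." have one field fewer in the sample above, so splitting on
# whitespace may misalign d_rf and d_ss there; the temperature columns are unaffected
data = pandas.read_table(io.StringIO(table_html), header=None, sep=r'\s+', index_col=0, parse_dates=[0],
                         names=["datum", "d_ta", "d_tx", "d_tn", "d_rs", "d_rf", "d_ss"])
# average every column per calendar month
data.resample('m').mean()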
Of course, you can use aggregation functions other than the mean. Output:
                 d_ta       d_tx       d_tn      d_rf      d_ss
datum
1991-01-31   1.345161   4.609677  -0.574194  3.000000  1.583333
1991-02-28  -1.142857   2.592857  -3.639286  5.157143  1.516667
1991-03-31   8.158065  12.093548   5.141935  2.645161  0.775000
1991-04-30   9.920000  14.570000   6.510000  4.066667  4.450000
1991-05-31  13.396774  17.780645   9.738710  4.529032  4.280000
...
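As a rough illustration of using aggregations other than the mean, here is a minimal sketch that applies a different function to each temperature column. It uses a handful of rows copied from the question's sample instead of the full download. The names `sample`, `daily`, `by_month` and `monthly` are made up for this example. It also groups on `index.to_period("M")` instead of calling `resample`, purely so that it does not depend on the monthly frequency alias, which has changed across pandas versions.

```python
import io

import pandas as pd

# A few rows copied from the question's sample (temperature columns only).
sample = u"""datum d_ta d_tx d_tn
1991-01-01 3.0 5.4 1.5
1991-01-02 4.0 7.2 1.9
1991-01-31 -5.0 -0.2 -7.4
1991-02-01 -8.1 -3.7 -11.7
1991-02-02 -7.0 -2.0 -10.2
"""

# Parse the whitespace-separated text and use the dates as the index.
daily = pd.read_csv(io.StringIO(sample), sep=r"\s+",
                    index_col="datum", parse_dates=["datum"])

# Group the daily rows by calendar month.
by_month = daily.groupby(daily.index.to_period("M"))

# One aggregation per column: mean of the daily mean temperature,
# highest daily maximum, lowest daily minimum.
monthly = by_month.agg({"d_ta": "mean", "d_tx": "max", "d_tn": "min"})
print(monthly)
```

The same dictionary can be passed to `data.resample(...).agg(...)` on the full table if you prefer to keep the `resample` call from the answer.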