Question

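I need hourly historical weather data (temperature) for Chicago, Illinois (ZIP code 60603).

Specifically, I need readings at hourly or 15-minute intervals for June and July 2017.

I searched NOAA, Weather Underground, and other sources, but found nothing that fits my use case. I also tried scraping with R and Python, with no luck.

Here are the code snippets I tried.

R: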

library(httr)
library(XML)
url <- "http://graphical.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php"
response <- GET(url,query=list(zipCodeList="10001",
                           product="time-series",
                           begin=format(Sys.Date(),"%Y-%m-%d"),
                           Unit="e",
                           temp="temp",rh="rh",wspd="wspd"))
doc   <- content(response,type="text/xml", encoding = "UTF-8")   # XML document with the data
# extract the date-times
dates <- doc["//time-layout/start-valid-time"]
dates <- as.POSIXct(xmlSApply(dates,xmlValue),format="%Y-%m-%dT%H:%M:%S")
# extract the actual data
data   <- doc["//parameters/*"]
data   <- sapply(data,function(d)removeChildren(d,kids=list("name")))
result <- do.call(data.frame,lapply(data,function(d)xmlSApply(d,xmlValue)))
colnames(result) <- sapply(data,xmlName)
# combine into a data frame
result <- data.frame(dates,result)
head(result)

Error:

Error in UseMethod("xmlSApply") : 
no applicable method for 'xmlSApply' applied to an object of class "list"

Python:

from pydap.client import open_url

# setup the connection
url = 'http://nomads.ncdc.noaa.gov/dods/NCEP_NARR_DAILY/197901/197901/narr-a_221_197901dd_hh00_000'
modelconn = open_url(url)
tmp2m = modelconn['tmp2m']

# grab the data
lat_index = 200    # you could tie this to tmp2m.lat[:]
lon_index = 200    # you could tie this to tmp2m.lon[:]
print(tmp2m.array[:,lat_index,lon_index] )

Error:

 HTTPError: 503 Service Temporarily Unavailable

Any other solution in R or Python, or a link to a relevant online dataset, would be appreciated.

Answer 1

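There is an R package, rwunderground, but I haven't had much success getting what I want out of it. Honestly, I'm not sure whether that's the package or me.

Eventually I broke down and wrote a quick script to get daily weather history for personal weather stations. You'll need to sign up for a Weather Underground API token (I'll leave that to you). Then you can use the following: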

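library(rjson)

api_key <- "your_key_here"
# One request per day, from 1 June to 31 July 2017
date <- seq(as.Date("2017-06-01"), as.Date("2017-07-31"), by = 1)
# Code of the personal weather station to query
pws <- "KILCHICA403"

# Pre-allocate one list slot per day
Weather <- vector("list", length = length(date))

for(i in seq_along(Weather)){
  # Build the history URL for this day and station
  url <- paste0("http://api.wunderground.com/api/", api_key,
                "/history_", format(date[i], format = "%Y%m%d"), "/q/pws:",
                pws, ".json")
  result <- rjson::fromJSON(paste0(readLines(url), collapse = " "))
  # Convert each element of result[[2]][[3]] to a data frame and row-bind them
  Weather[[i]] <- do.call("rbind", lapply(result[[2]][[3]], as.data.frame, 
                                          stringsAsFactors = FALSE))
  # Pause between requests to stay within the API rate limit
  Sys.sleep(6)
}

# Stack the per-day data frames into a single data frame
Weather <- do.call("rbind", Weather)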

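The call to Sys.sleep makes the loop wait 6 seconds before moving on to the next iteration. This is because the free API allows only 10 calls per minute (and at most 500 per day).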
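The 6-second pause follows from the quota: 60 seconds divided by 10 calls. For anyone working in Python instead, here is a minimal sketch of the same throttling idea. The helper names (`min_interval_seconds`, `throttled`) are made up for illustration and are not part of any Weather Underground client.

```python
import time


def min_interval_seconds(calls_per_minute):
    """Smallest gap between calls that stays within a per-minute quota."""
    return 60.0 / calls_per_minute


def throttled(items, calls_per_minute, sleep=time.sleep):
    """Yield items one at a time, pausing between them to respect the quota.

    `sleep` is injectable so the pacing can be checked without really waiting.
    """
    delay = min_interval_seconds(calls_per_minute)
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay)  # no wait is needed before the very first call
        yield item


# With the 10-calls-per-minute limit described above:
print(min_interval_seconds(10))  # 6.0
```

Wrapping the list of dates in `throttled(dates, 10)` would pace the requests the same way the R loop does. June and July together are 61 days, so one request per day also stays under the 500-per-day cap.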

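Also, some days may have no data. Remember that this connects to a personal weather station. There are any number of reasons it might stop uploading data, including internet outages, power outages, or the owner turning off the link to Weather Underground. If you can't get data from one station, try another one nearby and fill in the gaps.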
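One way to do that gap filling, sketched in Python under the assumption that each station's readings have already been reduced to a mapping from timestamp to temperature. The function name `fill_gaps` and the sample values are invented for illustration only.

```python
def fill_gaps(primary, fallback):
    """Merge two {timestamp: temperature} mappings, preferring `primary`.

    A reading from `fallback` is used only where `primary` has no entry
    or has None for that timestamp. Timestamps with no usable reading
    in either mapping are left out of the result.
    """
    merged = dict(fallback)
    # Overwrite fallback readings wherever the primary station has real data
    merged.update({ts: temp for ts, temp in primary.items() if temp is not None})
    return {ts: temp for ts, temp in sorted(merged.items()) if temp is not None}


# Made-up readings: the primary station is missing the 01:00 and 02:00 values
primary = {"2017-06-01 00:00": 61.0, "2017-06-01 01:00": None}
fallback = {
    "2017-06-01 00:00": 60.2,
    "2017-06-01 01:00": 60.1,
    "2017-06-01 02:00": 59.5,
}
for timestamp, temp_f in fill_gaps(primary, fallback).items():
    print(timestamp, temp_f)
```

Nearby stations can still differ by a degree or two, so it may be worth keeping track of which station each reading came from.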

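To get a weather station code, go to weatherunderground.com and enter the ZIP code you want in the search bar.

Click the "Change" link.

You will see the station code for the current station, along with options for other nearby stations.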

Answer 2

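Just to provide a Python solution for anyone who comes across this question. This will (per the post) go through each day in June and July 2017 and get all observations for the given location. It does not restrict results to 15-minute or hourly intervals, but it does provide all the data observed on each day. Additional parsing of each observation's time is still necessary, but this is a start.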

WunderWeather

pip install WunderWeather
pip install arrow

import arrow # learn more: https://python.org/pypi/arrow
from WunderWeather import weather # learn more: https://python.org/pypi/WunderWeather

api_key = ''  # your Weather Underground API key
extractor = weather.Extract(api_key)
zip_code = '02481'  # named zip_code to avoid shadowing the built-in zip()

begin_date = arrow.get("201706","YYYYMM")
end_date = arrow.get("201708","YYYYMM").shift(days=-1)
for date in arrow.Arrow.range('day',begin_date,end_date):
  # get date object for feature
  # http://wunderweather.readthedocs.io/en/latest/WunderWeather.html#WunderWeather.weather.Extract.date
  date_weather = extractor.date(zip_code,date.format('YYYYMMDD'))

  # use shortcut to get observations and data
  # http://wunderweather.readthedocs.io/en/latest/WunderWeather.html#WunderWeather.date.Observation
  for observation in date_weather.observations:
    print("Date:",observation.date_pretty)
    print("Temp:",observation.temp_f)
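As a rough sketch of the extra parsing step, the observations can be grouped into hourly buckets once each one has been turned into a `(datetime, temperature)` pair. How to build those pairs from the observation objects is left open here, and the function name `hourly_means` and the sample readings are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def hourly_means(observations):
    """Average (datetime, temp_f) pairs within each clock hour.

    Returns a dict keyed by the start of the hour, in chronological order.
    """
    buckets = defaultdict(list)
    for when, temp_f in observations:
        # Truncate the timestamp to the start of its hour
        hour = when.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(temp_f)
    return {hour: sum(temps) / len(temps) for hour, temps in sorted(buckets.items())}


# Made-up readings at irregular times
sample = [
    (datetime(2017, 6, 1, 0, 5), 60.0),
    (datetime(2017, 6, 1, 0, 35), 62.0),
    (datetime(2017, 6, 1, 1, 10), 59.0),
]
for hour, mean_temp in hourly_means(sample).items():
    print(hour.isoformat(), mean_temp)
```

Averaging is only one choice. Taking the reading closest to the top of each hour would work too, and the same truncation trick extends to 15-minute buckets by rounding the minute down to a multiple of 15.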

How to get hourly historical weather data (temperature) for Chicago, Illinois
