I want to fetch Hong Kong Observatory data every hour, as the forecast is updated.
My code for a single extraction is below.
library(RCurl)
web <- getURL("http://www.hko.gov.hk/contente.htm")
web <- unlist(strsplit(web, "\r\n"))
head(web)
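# Line 1245 of the page source holds the temperature cells (hard-coded index;
# it breaks if the page layout changes).
temp <- unlist(strsplit(web[1245], "</span>"))
MINtemp <- vector()
MAXtemp <- vector()
for (i in 1:9) {
  # Odd pieces end with the minimum temperature, even pieces with the maximum;
  # the value is the last two characters of each piece.
  mintemp <- substr(temp[2*i-1],
                    nchar(temp[2*i-1])-1,
                    nchar(temp[2*i-1]))
  mintemp <- as.numeric(mintemp)
  MINtemp <- append(MINtemp, mintemp)
  maxtemp <- substr(temp[2*i],
                    nchar(temp[2*i])-1,
                    nchar(temp[2*i]))
  maxtemp <- as.numeric(maxtemp)
  MAXtemp <- append(MAXtemp, maxtemp)
}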
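# Line 1242 holds the weather descriptions (hard-coded index and offsets).
status <- strsplit(
  substring(web[1242], 12),
  "</a></td><td align")
status <- substring(unlist(status), 178)
weather <- vector()
for (i in 1:9) {
  # Keep the text before "width" and drop its last three characters.
  status[i] <- unlist(strsplit(status[i], "width"))[1]
  weather <- append(weather,
                    substr(status[i],
                           1,
                           nchar(status[i])-3))
}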
# Line 1248 holds the relative humidity cells, parsed like the temperatures.
RH <- unlist(strsplit(web[1248], "</span>"))
MINRH <- vector()
MAXRH <- vector()
for (i in 1:9) {
  minRH <- substr(RH[2*i-1],
                  nchar(RH[2*i-1])-1,
                  nchar(RH[2*i-1]))
  minRH <- as.numeric(minRH)
  MINRH <- append(MINRH, minRH)
  maxRH <- substr(RH[2*i],
                  nchar(RH[2*i])-1,
                  nchar(RH[2*i]))
  maxRH <- as.numeric(maxRH)
  MAXRH <- append(MAXRH, maxRH)
}
forecast <- paste("+", 1:9, "day(s)", sep=" ")
current <- as.character(rep(Sys.time(),9))
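# Build the data frame directly; wrapping the columns in cbind() would coerce
# the numeric columns to character.
DATA <- data.frame(current, forecast, MINtemp, MAXtemp, MINRH, MAXRH, weather,
                   stringsAsFactors = FALSE)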
DATA
The data I get is:
> DATA
current forecast MINtemp MAXtemp MINRH MAXRH weather
1 2014-05-04 08:37:55 + 1 day(s) 21 25 80 95 Cloudy with a few showers and thunderstorms. Showers will be more frequent later
2 2014-05-04 08:37:55 + 2 day(s) 22 25 75 90 Cloudy with showers. A few squally thunderstorms at first
3 2014-05-04 08:37:55 + 3 day(s) 21 24 75 95 Cloudy with a few showers
4 2014-05-04 08:37:55 + 4 day(s) 22 25 80 95 Cloudy with a few showers
5 2014-05-04 08:37:55 + 5 day(s) 23 26 80 95 Cloudy with showers and a few squally thunderstorms
6 2014-05-04 08:37:55 + 6 day(s) 23 26 80 95 Cloudy with showers. Showers will be heavy at times with squally thunderstorms
7 2014-05-04 08:37:55 + 7 day(s) 22 25 80 95 Cloudy with showers and squally thunderstorms
8 2014-05-04 08:37:55 + 8 day(s) 22 25 70 95 Mainly cloudy with a few showers
9 2014-05-04 08:37:55 + 9 day(s) 22 26 70 90 Mainly cloudy
I want the R script to run every hour and then accumulate the dataset with rbind(DATA, data). I searched for similar topics on R CMD BATCH. Can I use Sys.sleep() and while(substr(Sys.time(), 15,16)=="00") in R?
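I also searched for similar questions about scheduling tasks (this link). Rscript is located at C:\Program Files\R\R-3.0.2\bin\Rscript.exe and I saved my R script as D:\mydocument\test.r, but it is still not clear to me how to complete the task.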
Answer 0 (score: 9)
You could use Sys.sleep(), but that smells like bad code.
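For illustration only, a Sys.sleep() loop might look roughly like the sketch below. The names seconds_to_next_hour and run_hourly are hypothetical helpers, not something from the question, and task stands for whatever function wraps the scraping code.

```r
# Hypothetical helper: whole seconds from `now` until the next full hour.
seconds_to_next_hour <- function(now = Sys.time()) {
  lt <- as.POSIXlt(now)
  3600 - (lt$min * 60 + floor(lt$sec))
}

# Hypothetical driver: sleep until the top of the hour, run the task, repeat.
# n_runs is only there so the loop can be bounded while experimenting.
run_hourly <- function(task, n_runs = Inf) {
  runs <- 0
  while (runs < n_runs) {
    Sys.sleep(seconds_to_next_hour())
    task()
    runs <- runs + 1
  }
}
```

The R session has to stay open indefinitely for this to work, and a single uncaught error in task() stops every later run.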
Instead, set up a cron job to run the script once an hour. Your script then stays simple, and the whole setup is more robust.
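Because every scheduled run starts a fresh R process, an in-memory rbind() will not survive between runs. A minimal sketch of one way around that, assuming each run appends its rows to a CSV file: the helper append_forecast and the file paths are hypothetical, and the crontab line in the comments is an illustrative assumption for a Unix-like system (on Windows, the built-in Task Scheduler plays the same role).

```r
# Hypothetical crontab entry (assumed path), running at minute 0 of every hour:
#   0 * * * * Rscript /path/to/test.r

# Hypothetical helper: append one run's forecast rows to a CSV log.
# The header is written only when the file is first created.
append_forecast <- function(new_rows, path) {
  first_write <- !file.exists(path)
  write.table(new_rows, file = path, sep = ",",
              row.names = FALSE, col.names = first_write,
              append = !first_write, qmethod = "double")
  invisible(path)
}
```

At the end of the script one could then call append_forecast(DATA, "forecast_log.csv") (an assumed file name) and later load the accumulated table with read.csv().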
Answer 1 (score: 3)