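I am trying to scrape data from https://www.snowyhydro.com.au/our-energy/water/storages/lake-levels-calculator/ using R. Specifically, I want to use the page's dropdown menus to scrape lake levels for different years. I have searched online for example code, but I am struggling to find a starting point for getting the values for each lake and each year with R.
I tried SelectorGadget on the page, but it did not work, I think because the chart is rendered with JavaScript.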
library('rvest')
url <- 'https://www.snowyhydro.com.au/our-energy/water/storages/lake-levels-calculator/'
webpage <- read_html(url)
I am looking for a table of the daily storage levels for all of the lakes.
Answer 0 (score: 0)
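I was able to find a better URL to request the data from: "https://www.snowyhydro.com.au/wp-content/themes/basic/get_dataxml.php"
The JSON response to that request does not map cleanly onto a table, but I think the functions here should do the conversion for you: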
library(httr)
library(jsonlite)
# This function is called from within the other to convert each day
# to its own dataframe, creating extra columns for the year, month, and day
entry.to.row <- function(entry) {
  date = entry[["-date"]]
  entry.df = data.frame(
    matrix(unlist(entry$lake), nrow = length(entry$lake), byrow = TRUE),
    stringsAsFactors = FALSE
  )
  colnames(entry.df) = c("LakeName", "Date", "Measurement")
  entry.df$Date = date
  date.split = strsplit(date, split = "-")[[1]]
  entry.df$Year = date.split[1]
  entry.df$Month = date.split[2]
  entry.df$Day = date.split[3]
  entry.df
}
# Fetch the data for two years and convert them into two data.frames which
# we will then merge into a single data.frame
fetch.data <- function(
  base.url = "https://www.snowyhydro.com.au/wp-content/themes/basic/get_dataxml.php",
  current,
  past
) {
  fetched = httr::POST(
    url = base.url,
    body = list("year_current" = current, "year_pass" = past)
  )
  datJSON = fromJSON(content(fetched, as = "text"), simplifyVector = FALSE)
  pastJSON = datJSON$year_pass$snowyhydro$level
  pastEntries = do.call("rbind", lapply(pastJSON, entry.to.row))
  currentJSON = datJSON$year_current$snowyhydro$level
  currentEntries = do.call("rbind", lapply(currentJSON, entry.to.row))
  rbind(pastEntries, currentEntries)
}
# Fetch the data for 2019 and 2018
dat = fetch.data(current=2019, past=2018)
> head(dat)
              LakeName       Date Measurement Year Month Day
1       Lake Eucumbene 2018-01-01       46.40 2018    01  01
2       Lake Jindabyne 2018-01-01       85.80 2018    01  01
3 Tantangara Reservoir 2018-01-01       42.94 2018    01  01
4       Lake Eucumbene 2018-01-02       46.41 2018    01  02
5       Lake Jindabyne 2018-01-02       85.72 2018    01  02
6 Tantangara Reservoir 2018-01-02       42.98 2018    01  02
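Because entry.to.row builds each data frame from unlist(), every column comes back as character. If you want one row per day with a column for each lake, something like the following base R sketch may help. It is only an illustration: the sample rows are copied from the head(dat) output above so the snippet runs without a network request, and the name `wide` is my own choice, not something the site or the functions above define.

```r
# Sample rows copied from the head(dat) output; in practice `dat`
# would be the data frame returned by fetch.data().
dat <- data.frame(
  LakeName    = rep(c("Lake Eucumbene", "Lake Jindabyne", "Tantangara Reservoir"), 2),
  Date        = rep(c("2018-01-01", "2018-01-02"), each = 3),
  Measurement = c("46.40", "85.80", "42.94", "46.41", "85.72", "42.98"),
  stringsAsFactors = FALSE
)

# The columns are character, so convert the ones used below.
dat$Date        <- as.Date(dat$Date)
dat$Measurement <- as.numeric(dat$Measurement)

# Reshape to one row per day and one column per lake (no extra packages).
wide <- reshape(
  dat[, c("LakeName", "Date", "Measurement")],
  idvar     = "Date",
  timevar   = "LakeName",
  direction = "wide"
)

# reshape() prefixes the new columns with "Measurement."; strip that prefix.
names(wide) <- sub("^Measurement\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

The long format returned by fetch.data is usually easier to filter and plot, so I would keep `dat` as it is and only build the wide table when you need it for display or export.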