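I am trying to get data from this web page: http://www.finanzen.net/zertifikate/emittent/UBS/DERI (click "Komplett" to see the full history I am trying to access).
The problem is that the data cannot be found in the page source; it appears to be generated interactively.
How can I access the data in machine-readable form?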
Answer 0 (score: 2)
library(seleniumPipes)
library(tidyverse)
# you need to figure out how to get selenium running and find the port
dr <- remoteDr("http://localhost", browserName="firefox", port="32772")
dr %>% go("http://www.finanzen.net/zertifikate/emittent/UBS/DERI")
# you will need to find a way to expand the slider range
keys <- dr %>% executeScript("return Object.keys(window.hschart1.series[0].data);")
keys <- unlist(keys)
# you have to iterate through the data array and return the individual key values
# since either Selenium or R can't convert the complex structure to a return value
map_df(keys, function(k) {
  # fetch the timestamp (epoch milliseconds) and the value of one point
  x <- dr %>% executeScript(sprintf("return window.hschart1.series[0].data[%s].x;", k))
  y <- dr %>% executeScript(sprintf("return window.hschart1.series[0].data[%s].y;", k))
  # tibble() replaces the deprecated data_frame()
  tibble(x=anytime::anytime(x/1000), y=y)
}) -> df
df
## # A tibble: 213 × 2
##                      x      y
##                 <dttm>  <dbl>
## 1  2016-02-11 19:00:00 -1.791
## 2  2016-02-14 19:00:00 -1.684
## 3  2016-02-15 19:00:00 -1.586
## 4  2016-02-16 19:00:00 -1.344
## 5  2016-02-17 19:00:00 -1.392
## 6  2016-02-18 19:00:00 -1.327
## 7  2016-02-21 19:00:00 -1.129
## 8  2016-02-22 19:00:00 -1.271
## 9  2016-02-23 19:00:00 -1.315
## 10 2016-02-24 19:00:00 -1.218
## # ... with 203 more rows
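The loop above makes two round trips to the browser per point. A possible alternative, sketched here under assumptions, is to have the page serialize the whole series to a JSON string and parse it once in R. The `executeScript` call shown in the comment is untested against this page, and `sample_json` is a hand-written stand-in for what such a call might return, not data taken from the site.

```r
library(jsonlite)

# With a live session, the series might be fetched in one call (untested assumption):
# raw <- dr %>% executeScript(paste0(
#   "return JSON.stringify(window.hschart1.series[0].data.map(",
#   "function(p) { return {x: p.x, y: p.y}; }));"))
# The JSON string would then take the place of sample_json below.

# Hypothetical payload: epoch milliseconds in x, chart value in y
sample_json <- '[{"x":1455235200000,"y":-1.791},{"x":1455494400000,"y":-1.684}]'

# fromJSON() turns an array of objects into a data frame with columns x and y
points <- fromJSON(sample_json)

# Highcharts stores timestamps in milliseconds; POSIXct expects seconds
points$x <- as.POSIXct(points$x / 1000, origin = "1970-01-01", tz = "UTC")

points
```

Setting `tz = "UTC"` explicitly avoids the local-timezone shift visible in the output above, where each timestamp is printed as 19:00:00 of the previous day.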