I'm trying to scrape all the links on an SVG map. I'm sorry I can't post the link, because you'd need a login, but the HTML file looks like this:
...
<div class = 'header'>
<div class = 'headermenu'>
## the only other place where <a href> tags exist
<a href = 'link_is_here'>...</a>
## four more of them, separated with non-breaking spaces
</div>
</div>
<div class = 'main'></div>
<svg parameters_of_graphic>
## paths for images
<a href='link_is_here' parameters_of_link>
## path for above link
</a>
## paths for images
<a href='link_is_here' parameters_of_link>
<circle parameters_of_circle></circle>
</a>
## multiple circle links of same format
</svg>
...
However, when I use home_url %>% read_html() %>% html_nodes('a'), I only get the five nodes under the header class. I searched for SVG scraping with rvest, but I couldn't find any way of scraping the nodes under the svg tag. Is there any way to do this in R?
Answer 0 (score: 0)
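I can't be sure without seeing the actual page, but I suspect the svg is generated dynamically with JavaScript. read_html() does not run JavaScript on the page, so those links may not exist at the time the page is read.
You should be able to inspect what read_html() returns to confirm this.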
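One way to run that check is sketched below. It is only an illustration: the inline `static_page` string and its `/menu1`, `/region1`, `/region2` links are made-up stand-ins modeled on the structure in the question, not the real page. In practice you would pass `home_url` to `read_html()` instead. The idea is to count the svg nodes in the parsed document; if the svg is in the static HTML, the selector `svg a` should reach its links.

```r
library(rvest)

# Hypothetical stand-in for the real page (assumption: same layout as in the question).
static_page <- "<html><body>
<div class='header'><div class='headermenu'>
<a href='/menu1'>Menu</a>
</div></div>
<svg>
<a href='/region1'><circle r='5'></circle></a>
<a href='/region2'><circle r='5'></circle></a>
</svg>
</body></html>"

# With the real site this would be: doc <- read_html(home_url)
doc <- read_html(static_page)

# Is there any svg element in what read_html() actually received?
has_svg <- length(html_nodes(doc, "svg")) > 0

# If so, restrict the selector to links nested inside the svg.
svg_links <- doc %>% html_nodes("svg a") %>% html_attr("href")

print(has_svg)
print(svg_links)
```

If `has_svg` comes back `FALSE` for the real page, the svg is most likely added by JavaScript after load, and a tool that drives a real browser would be needed. Recent rvest versions provide `read_html_live()` for this, though whether it works behind your login is something you would have to test.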