Question

I want to download an image from this page, using the XPath of the target image link as shown in Chrome: https://tophatter.com/lots/104461372

The URL I want to extract is:

https://images.tophatter.com/42c09f609e7a6a47c70e0e1ccf3a0bb6/large.jpg

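The following selector does not work: div[class='col-md-7 slot-images'] img

In Chrome > Inspect > click the large image, the XPath shown is: //*[@id="lot-modal-content"]/div[1]/img

The image sits in the body of the document, and the approach from the rvest tutorial does not work on it: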

library(rvest)
library(downloader)
library(dplyr)

url <- "https://tophatter.com/lots/104461372"
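# rvest re-exports xml2::read_html, so one call is enough
doc <- read_html(url)

# Read the class attribute of the matched div
doc %>% html_nodes("div.col-md-7") %>% html_attr("class")
# Try to read src from the same div (src belongs to an img, not a div)
doc %>% html_nodes("div.col-md-7") %>% html_attr("src")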

The first call returns 'col-md-7 slot-images' and the second returns NA.
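
For reference, here is a minimal sketch of how the two selectors behave on a small inline snippet. The markup is a hypothetical stand-in built from the class and id names in the question, not the real page source. If the real page inserts the image with JavaScript (an assumption, not verified here), the same selectors would find nothing in what read_html downloads.

```r
library(rvest)

# Hypothetical markup that mimics the structure described in the question
html <- '<html><body>
  <div id="lot-modal-content">
    <div class="col-md-7 slot-images">
      <img src="https://images.tophatter.com/42c09f609e7a6a47c70e0e1ccf3a0bb6/large.jpg">
    </div>
  </div>
</body></html>'
snippet <- read_html(html)

# The div has a class attribute but no src attribute, so this is NA
div_src <- snippet %>% html_nodes("div.col-md-7") %>% html_attr("src")

# Select the img inside the div, then read its src
css_src <- snippet %>% html_nodes("div.slot-images img") %>% html_attr("src")

# The XPath copied from Chrome, passed through the xpath argument
xpath_src <- snippet %>%
  html_nodes(xpath = '//*[@id="lot-modal-content"]/div[1]/img') %>%
  html_attr("src")
```

On this snippet both css_src and xpath_src hold the image URL, while div_src is NA because the attribute is read from the wrong node.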

Answer 1

Here is my solution. After some trial and error, I found the target jpg URL in the head of the page:

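# Collect the content attribute of every <meta> tag, dropping missing values
a = doc %>% html_nodes("meta") %>% html_attr("content") %>% na.omit
# Keep the entries that contain a literal ".jpg" (fixed() stops "." matching any character)
index = a %>% stringr::str_detect(stringr::fixed(".jpg")) %>% which
a[index]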
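
A slightly more targeted variant might look like the sketch below. It assumes the image URL is exposed through an Open Graph tag (property "og:image"), which is a guess about the page and not something confirmed above. The inline head is a hypothetical stand-in so the example runs without network access.

```r
library(rvest)

# Hypothetical head section; the og:image property name is an assumption
head_html <- '<html><head>
  <meta charset="utf-8">
  <meta name="description" content="Example lot">
  <meta property="og:image" content="https://images.tophatter.com/42c09f609e7a6a47c70e0e1ccf3a0bb6/large.jpg">
</head><body></body></html>'
page <- read_html(head_html)

# Ask for the one meta tag directly
img_url <- page %>%
  html_nodes("meta[property='og:image']") %>%
  html_attr("content")

# Fallback in the spirit of the answer: any meta content ending in .jpg
all_content <- page %>% html_nodes("meta") %>% html_attr("content")
jpg_urls <- all_content[!is.na(all_content) & grepl("\\.jpg$", all_content)]

# Saving the file needs network access, so it is left commented out:
# download.file(img_url, destfile = basename(img_url), mode = "wb")
```

The mode = "wb" argument matters on Windows, where a binary file such as a jpg can be corrupted if it is written in text mode.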

[Screenshot of the result]

Downloading the image with rvest using the XPath from Chrome does not work, so why does the URL found in the page head work?

1 answer: