Scraping a website in R (openpaymentsdata.cms.gov)

Time: 2017-09-26 09:46:19

Tags: r web-scraping

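I am trying to scrape the website "https://openpaymentsdata.cms.gov/search/physicians?firstname=&lastname=&city=&state=&zip=&country=&specialty=" to collect data for building a database. So far I have mostly tried the rvest, XML, and RCurl packages, using html_nodes and XPath, to retrieve the data, but I cannot scrape the list of physician names, specialties, and primary addresses from the specified URL.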

R code:

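library(rvest)  # provides read_html(), html_nodes(), html_text()

url <- "https://openpaymentsdata.cms.gov/search/physicians?firstname=&lastname=&city=&state=&zip=&country=&specialty="
webpage <- read_html(url)                   # download and parse the page
data_html <- html_nodes(webpage, ".a div")  # selector suggested by SelectorGadget
data <- html_text(data_html)                # extract the text of the matched nodes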

I used the SelectorGadget CSS selector tool to find the selector passed to html_nodes.

[Reference: screenshot of the CSS selector being used][1]

  [1]: https://i.stack.imgur.com/ZKaFM.png

The call html_nodes(webpage, ".a div") above returns an empty result: data_html is a list of 0.
Please advise how to scrape the list of doctors' names from the specified URL.
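
An empty result from html_nodes() means the selector matched nothing in the HTML that read_html() actually parsed. SelectorGadget works on the page as rendered in the browser, so if the listing is filled in by JavaScript after the page loads (a possible cause, not confirmed here for this site), those nodes would be missing from the downloaded HTML. The sketch below illustrates the difference offline. The HTML string, the "#results" container, and the ".physician-name" class are all hypothetical stand-ins, not the real markup of openpaymentsdata.cms.gov.

```r
library(rvest)

# Hypothetical markup standing in for a downloaded page: the "#results"
# container is empty, as it would be if a script populated it later.
static_html <- '<html><body>
  <div id="results"></div>
  <div class="a"><div>Example Name</div></div>
</body></html>'

# read_html() also accepts a literal HTML string, so no network is needed.
page <- read_html(static_html)

# Selector that matches content present in the parsed HTML.
present <- html_nodes(page, ".a div")

# Selector that targets content absent from the parsed HTML.
absent <- html_nodes(page, "#results .physician-name")

length(present)     # number of nodes matched by the first selector
length(absent)      # number of nodes matched by the second selector
html_text(present)  # text of the matched nodes
```

Here present holds one node and absent holds none. Running length(html_nodes(webpage, ".a div")) against the real page gives the same kind of check: a length of 0 indicates that the selector has nothing to match in the HTML as downloaded.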

0 answers:

No answers yet