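I have 1,000 records, each containing an email address and all the other address information. I would like to look up each record on this website: https://www.melissadata.com/lookups/businesscoder.asp. Is there a way to automate this process?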
Answer 0 (score: 0)
Here is an example of how to extract every link from a web page:
# R library for making HTTP requests
library(httr)
# R library for parsing XML and HTML
library(XML)
# perform a GET request to the website
response <- GET("https://www.melissadata.com/lookups/index.htm")
# read the body as UTF-8 text and parse it as HTML so XPath queries can be run
parsedoc <- htmlParse(content(response, as = "text", encoding = "UTF-8"), asText = TRUE)
# run an XPath query that returns the href attribute of every <a> element
links <- xpathSApply(parsedoc, "//a", xmlGetAttr, "href")
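The result of a query like this usually needs some cleanup before it can be used: anchors without an href come back as NULL, and many hrefs are relative, fragment-only, or duplicated. A minimal sketch in base R, assuming a hand-written `links` list in place of the real query result; the helper name `clean_links` and its `base` argument are made up for illustration, and only root-relative paths are resolved:

```r
# Hypothetical stand-in for the query result (not real output from the site)
links <- list(
  "/lookups/index.htm",
  NULL,                                             # anchor without an href
  "#top",                                           # fragment-only link
  "https://www.melissadata.com/lookups/index.htm",  # already absolute
  "/lookups/index.htm",                             # duplicate
  "mailto:someone@example.com"                      # not a web page
)

# Illustrative helper: drop unusable entries and make root-relative paths absolute
clean_links <- function(links, base = "https://www.melissadata.com") {
  hrefs <- unlist(links)                                     # unlist() drops NULL entries
  hrefs <- hrefs[!grepl("^(#|mailto:|javascript:)", hrefs)]  # keep only page links
  relative <- grepl("^/[^/]", hrefs)                         # root-relative paths such as /lookups/...
  hrefs[relative] <- paste0(base, hrefs[relative])
  unique(hrefs)
}

cleaned <- clean_links(links)
```

With the sample list above, `cleaned` holds a single absolute URL, because every remaining entry resolves to the same page.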
To scrape web pages, you should be familiar with XPath queries: https://www.w3schools.com/xml/xpath_intro.asp
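As a rough illustration of what XPath queries look like, the sketch below runs a few of them with the XML package against a small inline HTML string. The page content is made up so that the example runs without network access; it is not the structure of the actual site.

```r
library(XML)

# Made-up page used only for illustration
html <- '<html><body>
  <div id="nav"><a href="/lookups/a.asp">A</a><a href="/lookups/b.asp">B</a></div>
  <p class="note">First</p>
  <p>Second</p>
  <a name="anchor-only">No href</a>
</body></html>'
doc <- htmlParse(html, asText = TRUE)

# //a[@href] selects only the anchors that carry an href attribute
hrefs <- xpathSApply(doc, "//a[@href]", xmlGetAttr, "href")

# //div[@id='nav']/a selects anchors that are direct children of one specific element
nav_text <- xpathSApply(doc, "//div[@id='nav']/a", xmlValue)

# //p[@class='note'] filters elements by an attribute value
note <- xpathSApply(doc, "//p[@class='note']", xmlValue)
```

Filtering with a predicate such as `[@href]` inside the query avoids the NULL entries that a plain `//a` returns for anchors without an href.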