I'm trying to scrape a web page with R. I can already extract the text from a page and save it to a text file.
Now I'd like to save each comment in its own text file, and also visit pages 2 and 3 of the site. Any help would be much appreciated.
Here is my code:
library(RCurl)  # provides getURL()

# YY.csv holds one URL per line (no header) -- all the links to crawl
addresses <- read.csv("YY.csv", header = FALSE, stringsAsFactors = FALSE)[[1]]

outpath <- "outputpath"  # output directory

for (i in seq_along(addresses)) {
  full.text <- getURL(addresses[i])          # download the page source
  text.sub  <- gsub("<.+?>", "", full.text)  # remove HTML tags
  # write each page's text to its own file: 1.txt, 2.txt, ...
  write(text.sub, file = file.path(outpath, paste0(i, ".txt")))
}
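What I have tried so far, as a rough sketch of the two missing pieces: it assumes the site paginates with a `?page=N` query string and wraps each comment in `<div class="comment">` (both are guesses; the real URL pattern and XPath would need to be read off the actual page source). It uses `htmlParse()` and `xpathSApply()` from the XML package instead of the regex, so each comment can be written to its own file:

```r
library(RCurl)
library(XML)  # for htmlParse() / xpathSApply()

# Hypothetical URL pattern and comment selector -- inspect the real
# site and adjust "?page=" and the XPath expression accordingly.
base.url <- "http://example.com/reviews"
outpath  <- "outputpath"

n <- 0
for (p in 1:3) {                                # pages 1, 2 and 3
  page.src <- getURL(paste0(base.url, "?page=", p))
  doc      <- htmlParse(page.src, asText = TRUE)
  comments <- xpathSApply(doc, "//div[@class='comment']", xmlValue)
  for (comment in comments) {                   # one file per comment
    n <- n + 1
    write(comment, file = file.path(outpath, paste0(n, ".txt")))
  }
}
```

Is this the right direction, or is there a cleaner way to handle the pagination?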