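I have searched through a lot of reliable posts but could not find an example like mine. I followed the SelectorGadget example from the rvest vignette (https://blog.rstudio.com/2014/11/24/rvest-easy-web-scraping-with-r/), adapting it to my use case. None of the selectors suggested by SelectorGadget gave me what I need. I need to extract the name attached to each review on the page. In the page source, a name looks like this: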
<span itemprop="name" class="sg_selected">This Name</span>
Here is my code. Ideally, it should give me the individual names on this web page.
library(rvest)
library(dplyr)
dsa_reviews <-
  read_html("https://www.directsalesaid.com/companies/traveling-vineyard#reviews")
review_names <- html_nodes(dsa_reviews, '#reviews span')
df <- bind_rows(lapply(xml_attrs(review_names), function(x)
  data.frame(as.list(x), stringsAsFactors = FALSE)))
Apologies if this is a duplicate question or is poorly formatted. Feel free to request any necessary edits.
Answer 0 (score: 3)
Here you go:
library(rvest)
library(dplyr)
dsa_reviews <-
read_html("https://www.directsalesaid.com/companies/traveling-vineyard#reviews")
html_nodes(dsa_reviews,'[itemprop=name]') %>%
html_text()
[1] "Traveling Vineyard" ""
[3] "Kiersten Ray-kuhn" "Miley Sama"
[5] " Nancy Shawtone " "Amanda Moore"
[7] "Matt" "Kathy Barzal"
[9] "Lesa Brinker" "Lori Stryker"
[11] "Jeanette Holtman" "Penny Notarnicola"
[13] "Laura Ann" "Nicole Lafave"
[15] "Gretchen Hess Miller" "Gina Devine"
[17] "Ashley Lawton Converse" "Morgan Williams"
[19] "Angela Baston Mckeone" "Traci Feshler"
[21] "Kisha Marshall Dlugos" "Jody Cole Dvorak"
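The output above still contains the company name, an empty string, and one name padded with spaces. The code in the question also calls xml_attrs(), which returns each node's attributes (such as itemprop and class) rather than its text, so the names never appear in the data frame. Below is a minimal sketch of one way to tidy the result. It runs against a small inline HTML snippet that is only a hypothetical stand-in for the real page, and it assumes the reviewer names sit inside the element with id "reviews"; the actual page structure may differ.

```r
library(rvest)

# Hypothetical stand-in for the real page: the markup below is an assumption
# made for illustration, not a copy of the actual site.
page <- read_html('
<html><body>
  <h1 itemprop="name">Traveling Vineyard</h1>
  <div id="reviews">
    <span itemprop="name">Kiersten Ray-kuhn</span>
    <span itemprop="name"> Nancy Shawtone </span>
    <span itemprop="name"></span>
    <span class="date">2017</span>
  </div>
</body></html>')

# Restrict the attribute selector to descendants of #reviews so that the
# company name outside the reviews block is not matched.
review_names <- page %>%
  html_nodes("#reviews [itemprop=name]") %>%
  html_text(trim = TRUE)   # trim = TRUE strips leading/trailing whitespace

# Drop entries that are empty after trimming.
review_names <- review_names[review_names != ""]

print(review_names)
```

If the names on the live page are not nested under #reviews, dropping the "#reviews " prefix and filtering the resulting vector by hand would be the fallback.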
Colin