Is there a way to do Command+A to highlight all text on a page with RSelenium? Or another way to highlight all the text on a page?

Asked: 2017-07-16 00:11:15

Tags: r web-scraping rselenium

Maybe something like this with the sendKeysToActiveElement function?

sendKeysToActiveElement(list(key = 'command_meta', "U+0074"))

U+0074 is meant to be the UTF code for lowercase "a" (lowercase "a" is actually U+0061; U+0074 is "t").

2 Answers:

Answer 0: (score: 1)

You may be interested in this article from zevross.com. It posts the following code on the page as a "preview":

# Sneak preview of code for interacting with a web page with RSelenium
# a proper blog post with explanation will follow.

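library(RSelenium)
library(XML) # provides htmlParse() and readHTMLTable()
# make sure you have the server
# (checkForServer() and startServer() are defunct in newer RSelenium
# releases, 1.7.0 onward; rsDriver(), used in answer 1 below, replaces them)
checkForServer()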

# use default server 
startServer()
remDr<-remoteDriver$new()


# send request to server
url<-"https://programs.iowadnr.gov/animalfeedingoperations/FacilitySearch.aspx?Page=0"
remDr$open(silent = TRUE) #opens a browser
remDr$navigate(url)


# identify search button and click
searchID<-'//*[@id="ctl00_foPageContent_SearchButton"]'
webElem<-remDr$findElement(value = searchID)
webElem$clickElement()

# identify the table
tableID<-'//*[@id="ctl00_foPageContent_Panel1"]/div[2]/table'
webElem<-remDr$findElement(value = tableID)

doc<-htmlParse(remDr$getPageSource()[[1]])

tabledat<-readHTMLTable(doc)[[17]]
# strip stray "Â" characters left in the table by an encoding mismatch
tabledat[,]<-lapply(tabledat[,],
    function(x) gsub("Â", "", as.character(x)))
tabledat<-tabledat[-nrow(tabledat),-1]
# go to next page
nextID<-'//*[@id="ctl00_foPageContent_FacilitySearchRepeater_ctl11_PagePlus1"]'
webElem<-remDr$findElement(value = nextID)
webElem$clickElement()

It also gives this code as a function for extracting the data (which is subsequently mapped):

# FUNCTION from help for chartr 
capwords<-function(s, strict = FALSE) {
  cap<-function(s) paste(toupper(substring(s, 1, 1)),
        {s<-substring(s, 2); if(strict) tolower(s) else s},
        sep = "", collapse = " " )
    sapply(strsplit(s, split = " "),
        cap, USE.NAMES = !is.null(names(s)))
}
# ---------------------------------------


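library(dplyr) # provides mutate(), filter(), select() and %>%

# `geocodes` and `fnames` are not defined in this excerpt
full<-mutate(geocodes, name=fnames) %>%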
  mutate(category=ifelse(grepl("Winery", name), 1, 2)) %>%
  mutate(addressUse=gsub("Ny", "NY", capwords(gsub(", usa", "", address)))) %>%
  mutate(street=sapply(strsplit(addressUse, ","), "[[", 1)) %>%
  mutate(city=sapply(strsplit(addressUse, ","), "[[", 2)) %>%
  filter(!grepl('Interlaken|Ithaca|Aurora|Seneca Falls', street)) %>%
  select(name, street, city, category, lat, lon)

head(full)

This article by Jim Plante and this one by John Harrison are also worth a look.

Answer 1: (score: 1)

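As Rachel says, you can use key presses as outlined in some of the links she provided. You can send key presses to an element (an HTML tag). The body HTML tag can be used to send them to the page: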

library(RSelenium)
rD <- rsDriver()
appURL <- "https://stackoverflow.com/questions/45123833/is-there-a-way-to-do-commanda-to-highlight-all-text-on-a-page-with-rselenium-o/45123917#45123917"
remDr <- rD$client
remDr$navigate(appURL)
# select the page 
bElem <- remDr$findElement("css", "body")
# send key press to page
bElem$sendKeysToElement(list(key = "control", "a"))
remDr$screenshot(display = TRUE)

# cleanup
rm(rD)
gc()

The Unicode value for the command key is '\ue03d'. Checking it against the special keys in RSelenium:

sapply(selKeys, function(x) identical(x, '\ue03d'))

shows that the command key is referenced as command_meta, so on a Mac (untested) you could use:

bElem$sendKeysToElement(list(key = "command_meta", "a"))

in place of the corresponding line above.
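
If the same script has to run on both Mac and non-Mac machines, the two variants above could be folded into one small helper. This is only a sketch: `select_all_keys` is a hypothetical name, not part of RSelenium, and it assumes the browser runs on the same machine as the R session, since `Sys.info()` reports the operating system of R rather than of the browser. Like the Mac variant above, the `command_meta` branch is untested.

```r
# Hypothetical helper (not part of RSelenium): build the key list for
# "select all" based on an operating system name.
# Assumption: the browser runs on the same machine as this R session.
select_all_keys <- function(sysname = Sys.info()[["sysname"]]) {
  # macOS reports "Darwin"; everything else falls back to the control key
  modifier <- if (identical(sysname, "Darwin")) "command_meta" else "control"
  list(key = modifier, "a")
}

# Usage (needs an open session and bElem found as in the example above):
# bElem$sendKeysToElement(select_all_keys())
```

With a session open as in the example above, `bElem$sendKeysToElement(select_all_keys())` would then send the combination to the page body.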