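I'm trying to scrape a list of prices and game names from the Steam store at the URL below, but I can't work out how xpathSApply should be used to parse the page:
http://store.steampowered.com/search/?sort_by=Price&sort_order=ASC
Here is my code: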
require(RCurl)
require(XML)
url <- "http://store.steampowered.com/search/results?sort_by=Name&sort_order=ASC&category1=1"
SOURCE <- getURL(url,encoding="UTF-8") #Download the page
substring (SOURCE,1,200)
PARSED <- htmlParse(SOURCE) #Format the html code
##My problem is in this line below
(xpathSApply(PARSED, "//div[@class='col search_price']"))
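Answer 0 (score: 3)
Try this. xpathSApply returns the matched nodes themselves unless you give it a function to apply to each one, so pass xmlValue to extract the text: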
require(RCurl)
require(XML)
url <- "http://store.steampowered.com/search/?sort_by=Metascore&sort_order=DESC&"
SOURCE <- getURL(url, encoding="UTF-8") #Download the page
PARSED <- htmlParse(SOURCE, asText = TRUE, encoding = "utf-8")
xpaths <- c(price="//a/div[@class='col search_price']",
title="//div[@class='col search_name ellipsis']/h4")
res <- sapply(xpaths, function(x) xpathSApply(PARSED, x, xmlValue, trim = TRUE) )
head(res)
#      price    title
# [1,] "9,99€"  "Half-Life 2"
# [2,] "9,99€"  "Half-Life"
# [3,] "19,99€" "BioShock™"
# [4,] "18,99€" "The Orange Box"
# [5,] "19,99€" "Portal 2"
# [6,] "14,99€" "The Elder Scrolls V: Skyrim"
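
If you want to check the XPath expressions without hitting the network, the same calls can be tried on a small inline document. This is only a sketch: the HTML string below is made-up markup that mimics the structure the expressions above assume, not a copy of the real Steam page, and the names html, doc, prices and titles are just for illustration.

```r
library(XML)

# Hypothetical markup (an assumption, not taken from the Steam store):
# each result is an <a> wrapping a price <div> and a name <div> with an <h4>.
html <- '
<html><body>
  <a href="#">
    <div class="col search_price"> 9,99 </div>
    <div class="col search_name ellipsis"><h4>Half-Life 2</h4></div>
  </a>
  <a href="#">
    <div class="col search_price"> 19,99 </div>
    <div class="col search_name ellipsis"><h4>Portal 2</h4></div>
  </a>
</body></html>'

# asText = TRUE tells htmlParse that the argument is HTML content, not a file name.
doc <- htmlParse(html, asText = TRUE)

# xmlValue extracts the text of each matched node;
# trim = TRUE strips the surrounding whitespace.
prices <- xpathSApply(doc, "//a/div[@class='col search_price']",
                      xmlValue, trim = TRUE)
titles <- xpathSApply(doc, "//div[@class='col search_name ellipsis']/h4",
                      xmlValue, trim = TRUE)

print(data.frame(title = titles, price = prices, stringsAsFactors = FALSE))
```

Note that @class='...' is an exact string match on the whole attribute value, so if the site changes the class names or their order, the expressions will silently match nothing and need updating.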