I am trying to build a URL from data and scrape the result from a website. My function is on GitHub: https://github.com/blaquans/RInsee/blob/master/R/inflation.R.
It works for small numbers, but fails for large ones:
> inflation(input = 4000, input.currency = "Euro", input.year = 2004, output.currency = "Euro", output.year = 2012)
http://www.insee.fr/fr/themes/calcul-pouvoir-achat.asp?sommeDepart=4000&deviseDepart=Euro&anneeDepart=2004&deviseArrivee=Euro&anneeArrivee=2012
[1] 4569.57
> inflation(input = 400000, input.currency = "Euro", input.year = 2004, output.currency = "Euro", output.year = 2012)
ERROR
http://www.insee.fr/fr/themes/calcul-pouvoir-achat.asp?sommeDepart=4e+05&deviseDepart=Euro&anneeDepart=2004&deviseArrivee=Euro&anneeArrivee=2012
[1] NA
The reason is that R converts 400000 to 4e+05 when building the string, so the resulting URL is invalid. How can I force R to write 400000 instead of 4e+05?
Answer 0 (score: 1)
Use options(scipen = ) to penalize scientific notation while the URL is being built:
inflation <- function(input, input.currency, input.year, output.currency, output.year) {
  # Raise scipen so that large numbers are pasted as 400000 rather than 4e+05.
  # on.exit() restores the previous setting even if the function stops early.
  oldopts <- options(scipen = 999)
  on.exit(options(oldopts))
  require("RCurl")
  tx <- getURL(paste("http://www.insee.fr/fr/themes/calcul-pouvoir-achat.asp?sommeDepart=", input, "&deviseDepart=", input.currency, "&anneeDepart=", input.year, "&deviseArrivee=", output.currency, "&anneeArrivee=", output.year, sep = ""))
  # Capture the amount inside <strong class="resultat">...</strong>
  patrick <- ".*<strong class=\"resultat\">([[:digit:][:blank:],]+)[[:blank:]eurosfracin]+</strong>.*"
  if (grepl(pattern = patrick, x = tx)) {
    out <- sub(pattern = patrick, replacement = "\\1", x = tx)
    # Replace the decimal comma with a point, then strip blanks before converting
    out <- as.numeric(gsub(pattern = "[[:blank:]]", replacement = "", x = gsub(pattern = ",", replacement = ".", x = out)))
  } else {
    cat("ERROR \n")
    out <- NA
  }
  return(out)
}
Testing with your examples:
> inflation(input = 4000, input.currency = "Euro", input.year = 2004, output.currency = "Euro", output.year = 2012)
Loading required package: RCurl
Loading required package: bitops
[1] 4569.57
>
> inflation(input = 400000, input.currency = "Euro", input.year = 2004, output.currency = "Euro", output.year = 2012)
[1] 456956.5
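If you would rather not touch a global option at all, another possibility is to format just the one number before pasting it. The following is only a minimal sketch, not part of the function above: the variable names x, amount and url_fragment are made up for illustration, and only the query-string fragment is built, with no request sent.

```r
# Illustrative value taken from the failing call in the question.
x <- 400000

# format() with scientific = FALSE forces fixed notation for this value only,
# so the global scipen option is left unchanged. trim = TRUE avoids padding.
amount <- format(x, scientific = FALSE, trim = TRUE)

# Build just the part of the query string that carried the bad value.
url_fragment <- paste0("sommeDepart=", amount)
print(url_fragment)
```

Since amount is already a character string, paste() or paste0() inserts it as-is and no further number-to-string conversion takes place.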