Question

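I'm basically trying to browse a URL that contains Japanese characters. This follows on from my first question, which I asked yesterday. My code now generates the correct URL, and if I just take the URL and paste it into a browser I get the right result, but if I try to automate the process by calling browseURL(), I get the wrong result.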

For example, I want to open the following URL:

http://www.google.com/trends/trendsReport?hl=en-US&q=VWゴルフ %2B VWポロ %2B VWパサート %2B VWティグアン&date=1%2F2010 68m&cmpt=q&content=1&export=1

If I now use

browseURL("http://www.google.com/trends/trendsReport?hl=en-US&q=VWゴルフ %2B VWポロ %2B VWパサート %2B VWティグアン&date=1%2F2010 68m&cmpt=q&content=1&export=1")

I can see in the browser that it actually went to

www.google.com/trends/trendsReport?hl=en-US&q=VW%E3%83%BB%EF%BD%BDS%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BDt%20%2B%20VW%E3%83%BB%EF%BD%BD%7C%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BD%20%2B%20VW%E3%83%BB%EF%BD%BDp%E3%83%BB%EF%BD%BDT%E3%83%BB%EF%BD%BD[%E3%83%BB%EF%BD%BDg%20%2B%20VW%E3%83%BB%EF%BD%BDe%E3%83%BB%EF%BD%BDB%E3%83%BB%EF%BD%BDO%E3%83%BB%EF%BD%BDA%E3%83%BB%EF%BD%BD%E3%83%BB%EF%BD%BD&date=1%2F2010%2068m&cmpt=q&content=1&export=1

This looks like an encoding error. I have already tried

browseURL(URL, encodeIfNeeded=TRUE)

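but that doesn't seem to change a thing. As far as I understand the function, it shouldn't, because that argument is there to generate those "%B"-style escape sequences. That makes me even more surprised that I get them even with encodeIfNeeded = FALSE.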

Any help is highly appreciated!

> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 8 (build 9200)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=Japanese_Japan.932           LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.2.1

Answer 1

I think this will solve the problem:

library(httr)
library(curl)

gt_url <- "http://www.google.com/trends/trendsReport?hl=en-US&q=VWゴルフ %2B VWポロ %2B VWパサート %2B VWティグアン&date=1%2F2010 68m&cmpt=q&content=1&export=1"

# ensure the %2B's aren't getting in the way then
# ask httr to carve up the url and put it back together
parts <- parse_url(URLdecode(gt_url))
browseURL(build_url(parts))

That gives this (it is too long to paste here, but I wanted to make sure the OP could see the whole thing).

I also now understand why you have to do it this way (download.file and GET with write_disk don't work because of a JavaScript redirect).
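
For what it's worth, here is a rough base-R sketch of what the round trip above is presumably achieving: percent-encoding the Japanese text from its UTF-8 bytes instead of the session's native bytes. In the mangled URL from the question, the stray S and t around the first term are 0x53 and 0x74, which are the second bytes of ゴ and フ in CP932, so it looks as if the URL reached the browser in the native encoding (LC_CTYPE is Japanese_Japan.932). encode_utf8_component() is a made-up helper name, not something from base R or httr, and the CP932 check assumes your iconv build knows that name. Only the first two search terms are used to keep it short.

```r
# Hypothetical helper (not part of base R or httr): percent-encode one
# query value from its UTF-8 bytes, whatever the session locale is.
encode_utf8_component <- function(x) {
  bytes <- as.integer(charToRaw(enc2utf8(x)))
  # RFC 3986 "unreserved" characters are left as they are.
  unreserved <- as.integer(charToRaw(paste0(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz",
    "0123456789-._~"
  )))
  out <- sprintf("%%%02X", bytes)   # escape every byte first
  keep <- bytes %in% unreserved     # then restore the unreserved ones
  out[keep] <- rawToChar(as.raw(bytes[keep]), multiple = TRUE)
  paste(out, collapse = "")
}

# \u escapes keep the strings independent of the source file's encoding.
golf <- "\u30b4\u30eb\u30d5"  # first search term after "VW"
polo <- "\u30dd\u30ed"        # second search term after "VW"

# Optional diagnostic: the same text as CP932 bytes. This assumes the
# iconv build accepts the name "CP932"; otherwise the check is skipped.
cp932 <- tryCatch(
  iconv(golf, from = "UTF-8", to = "CP932", toRaw = TRUE)[[1]],
  error = function(e) NULL
)
if (!is.null(cp932)) {
  print(cp932)  # bytes below 0x80 are the ASCII letters that leak through
}

# Join the encoded terms with " + " (space, plus, space), already escaped.
terms <- paste0("VW", c(golf, polo))
query <- paste(vapply(terms, encode_utf8_component, character(1)),
               collapse = "%20%2B%20")
gt_url_encoded <- paste0(
  "http://www.google.com/trends/trendsReport?hl=en-US&q=", query,
  "&date=1%2F2010%2068m&cmpt=q&content=1&export=1"
)
print(gt_url_encoded)
# browseURL(gt_url_encoded)  # uncomment in an interactive session
```

Because the resulting URL is pure ASCII, there should be nothing left for the locale to reinterpret on the way to the browser. I have not tested this on a CP932 Windows session, so treat it as an illustration rather than a drop-in fix.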

How do I get the right encoding with browseURL()?
