我正在尝试将多个字符串从本地语言(“泰米尔语”)翻译成英语。我使用了Stackoverflow的示例代码并修改了脚本:
library(RCurl)
library(XML)
library(Unicode)
getParam = read.csv("file.csv", header=T, encoding = "UTF-8")
get<-u_to_lower_case(getParam$text)
get1<-matrix(1:6,nrow = 6, ncol = 1)
get1<-(as.vector(get))
data2<-matrix(1:6,nrow=6, ncol = 1)
translateFrom = "ta"
translateTo = "en"
for (i in 1:length(get1))
{
URL <- paste("https://translate.google.pl?
hl=",translateTo,"&sl=",translateFrom,"&tl=",translateTo,"&ie=UTF-
8&prev=_m&q=",get1[i],sep="")
print(URL)
page <- getURL(URL)
tree <-htmlTreeParse(page)
body <- tree$children$html$children$body
body_text <- body$children[[5]]$children[[1]]
print(body_text)
data1[i]<-body_text
if(nrow(data1) == 0)
{data1[1,] <- "N/A"
}
data2<-rbind(data2,data1)
}
没有循环,我可以翻译它。但是,在循环中,我得到了这样的错误:
In structure(x$children, class = "XMLNodeList") :
Calling 'structure(NULL, *)' is deprecated, as NULL cannot have
attributes.
Consider 'structure(list(), *)' instead.
有人可以更正我的代码或提出更好的代码吗?
注意:为了避免陷入Unicode错误,进行了多次修改。但是,终于克服了它。 另外,我不想使用付费的Google翻译API。