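I am trying to use url_decode to decode a large number of URLs containing text in different languages (Thai/Vietnamese/Chinese).
The encoded URLs look like this: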
click_search_search=Hanmyshop&by=pop&order=des
click_search_search=sp1114&by=pop&order=des
click_search_search=hanmyshop&by=pop&order=des
click_search_search=%C4%91%E1%BB%93ng%20h%E1%BB%93&by=pop&order=des
click_search_search=Sp1114&by=pop&order=des
click_search_search=nike&by=pop&order=des
click_search_search=%E4%BA%8C%E6%89%8B&by=pop&order=des
click_search_search=%E6%89%8B%E9%8C%B6&by=pop&order=des
click_search_search=%E5%BE%8C%E8%83%8C%E5%8C%85&by=pop&order=des
click_search_search=%E8%BF%AA%E5%A3%AB%E5%B0%BC&by=pop&order=des
click_search_search=iphone&by=pop&order=des
I am using the code below in R to decode them:
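# Pull the encoded URLs out of the data frame as a plain vector
url <- as.vector(book1$Testing.URL)
# Percent-decode every URL
de_url <- url_decode(url)
# Mark the decoded strings as UTF-8
Encoding(de_url) <- "UTF-8"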
The result shown in the console:
click_search_search=Hanmyshop&by=pop&order=des
click_search_search=sp1114&by=pop&order=des
click_search_search=hanmyshop&by=pop&order=des
click_search_search=đồng hồ&by=pop&order=des
click_search_search=Sp1114&by=pop&order=des
click_search_search=nike&by=pop&order=des
click_search_search=二手&by=pop&order=des
click_search_search=手錶&by=pop&order=des
click_search_search=後背包&by=pop&order=des
click_search_search=迪士尼&by=pop&order=des
click_search_search=iphone&by=pop&order=des
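When I add the decoded values to book1 as a separate column named "Decoded.URL",
book1$Decoded.URL=de_url
and then run View(book1),
the result looks different from the console output. All the Vietnamese and Chinese characters are shown as escapes of the form <U+1E3>.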
I also tried write.table with fileEncoding="utf-8",
but it did not help: the Chinese characters display correctly, the Vietnamese ones do not. Any idea how to solve this?
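
In case it helps anyone reproduce this, here is a minimal sketch of the same steps that does not need my real data. Note the assumptions: the three URLs are copied from the sample above as a stand-in for book1$Testing.URL, and base R's utils::URLdecode is swapped in for url_decode so that no extra package is required. It may not behave identically to url_decode in every case.

```r
# Stand-in for book1$Testing.URL: three of the sample URLs from above
urls <- c(
  "click_search_search=nike&by=pop&order=des",
  "click_search_search=%C4%91%E1%BB%93ng%20h%E1%BB%93&by=pop&order=des",
  "click_search_search=%E4%BA%8C%E6%89%8B&by=pop&order=des"
)

# Percent-decode each URL one at a time.
# Assumption: utils::URLdecode is used here in place of url_decode.
decoded <- vapply(urls, utils::URLdecode, character(1), USE.NAMES = FALSE)

# Declare the decoded bytes to be UTF-8 (pure ASCII strings stay "unknown")
Encoding(decoded) <- "UTF-8"

# Rebuild a small data frame and add the decoded column, as in the question
book1 <- data.frame(Testing.URL = urls, stringsAsFactors = FALSE)
book1$Decoded.URL <- decoded

# Inspect the declared encoding of each decoded string
print(Encoding(book1$Decoded.URL))
```

Running print(book1) or View(book1) on this small data frame should be enough to check whether the escaped display shows up on a given machine.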