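The output below was generated when saving an R data frame as JSON. My data frame contains a mix of HTML links and some accented characters. I have to use this file in a PHP/HTML environment.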
library(jsonlite)
output_json <- toJSON(output, dataframe = "rows", pretty = TRUE)
write(output_json, file = "output.txt")
{
"PMID":"<a href= \"http://www.ncbi.nlm.nih.gov/pubmed/?term=19369233\"
target=\"_blank\">19369233</a>",
"Title":"Delayed achievement of cytogenetic and molecular response is
associated with increased risk of progression among patients with
chronic myeloid leukemia in early chronic phase receiving
high-dose or standard-dose imatinib therapy.",
"Author":"Quintás-Cardama A",
"Random author names":"Järås M", "Imrédi E", "Tímár J."
},
When I open output.txt in an HTML page, or print the output, the accented letters in the first and last authors' names are replaced with ?, for example: Imr�di E.
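When I read the JSON file with the PHP code below, json_decode fails and returns NULL. From searching SO, I concluded that the problem comes from the accented characters and, in some cases, from improperly escaped newlines (\r\n) or HTML tags.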
$r_output = file_get_contents('output.txt');
$array_json = json_decode($r_output, true);
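When json_decode returns NULL, PHP's json_last_error_msg() can indicate whether the cause is an encoding problem or a syntax problem. The following is a minimal sketch, not the asker's code: decode_or_explain is a hypothetical helper name, and the sample string simulates a Latin-1 encoded file rather than reading output.txt.

```php
<?php
// Hypothetical helper: decode JSON and report why decoding failed.
function decode_or_explain(string $json): array
{
    $data = json_decode($json, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        // json_last_error_msg() describes the failure (encoding, syntax, ...)
        return ['ok' => false, 'error' => json_last_error_msg(), 'data' => null];
    }
    return ['ok' => true, 'error' => null, 'data' => $data];
}

// Simulated input: "é" stored as the single Latin-1 byte 0xE9,
// which is not valid UTF-8, so json_decode rejects it.
$latin1 = "{\"Author\":\"Imr\xE9di E\"}";
$result = decode_or_explain($latin1);
echo $result['ok'] ? "decoded\n" : "failed: {$result['error']}\n";
```

If the message points to malformed UTF-8, the file encoding is the likely culprit; if it reports a syntax error, the JSON structure itself needs checking.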
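I tried the fixes suggested in "How do I handle newlines in JSON?" and "PHP json_decode() returns NULL with valid JSON?", among others, but none of them solved the problem.
So I am tagging both PHP and R users to find out whether there is a better way to write the JSON from R that avoids this problem, or a way to clean the JSON in PHP before reading it.
Thanks for your help.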
Answer 0 (score: 2)
Try utf8_encode on $r_output and remove the newlines, i.e.:
$r_output = utf8_encode(file_get_contents('output.txt'));
$r_output = preg_replace("/[\n\r]/","",$r_output);
$array_json = json_decode($r_output, true);
Or try utf8_decode:
$r_output = utf8_decode(file_get_contents('output.txt'));
$r_output = preg_replace("/[\n\r]/","",$r_output);
$array_json = json_decode($r_output, true);
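Note that utf8_encode and utf8_decode are deprecated as of PHP 8.2, and applying utf8_encode to text that is already UTF-8 double-encodes it. A possible alternative, sketched below, converts only when the input is not valid UTF-8. It assumes the mbstring extension is available and that any non-UTF-8 input is ISO-8859-1; to_utf8 is a hypothetical helper name, and the sample string stands in for the file contents.

```php
<?php
// Hypothetical helper: return a UTF-8 string, converting only when needed.
// Assumption: input that is not valid UTF-8 is ISO-8859-1 (Latin-1).
function to_utf8(string $raw): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
}

// Simulated file contents with "á" stored as the Latin-1 byte 0xE1.
$raw = "{\"Author\":\"Quint\xE1s-Cardama A\"}";
$decoded = json_decode(to_utf8($raw), true);
echo $decoded['Author'], "\n";
```

Checking first keeps the helper safe to call on files that were already written as UTF-8.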
PS: your JSON appears to be invalid -> "Imrédi E", "Tímár J." (a single key is followed by several comma-separated strings)
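To illustrate the point about validity: a key followed by several bare strings is a syntax error, while the same names wrapped in an array decode fine. This is only a sketch; whether the original column was meant to be an array is an assumption, and the snippet assumes the PHP source file itself is saved as UTF-8.

```php
<?php
// One key followed by several bare strings is not valid JSON.
$invalid = '{"Random author names":"Järås M", "Imrédi E", "Tímár J."}';

// Assumed intent: the names form a list, so wrap them in an array.
$valid = '{"Random author names":["Järås M", "Imrédi E", "Tímár J."]}';

var_dump(json_decode($invalid, true)); // NULL, because of the syntax error

$names = json_decode($valid, true)['Random author names'];
echo count($names), "\n"; // 3
```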
Answer 1 (score: 1)
First, write the output file as UTF-8:
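library(jsonlite)
output_json <- toJSON(output, dataframe = "rows", pretty = TRUE)
# Open the file through a connection so the text is written as UTF-8
con <- file("output.txt", encoding = "UTF-8")
write(output_json, file = con)
# Release the connection once the file has been written
close(con)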