How do I make raw Unicode-escaped content readable?

Asked: 2017-02-21 11:21:44

Tags: json go unicode escaping

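I use net/http to call a web API, and the server returns a JSON response. When I print the response body, the non-ASCII text shows up as raw ASCII escape sequences (\uXXXX). I tried using bufio.ScanRunes to parse the content, but that did not work.

I also tried writing a simple server that returns a Unicode string, and that works fine.

Here is the core code: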

func (c ClientInfo) Request(method string, url string, form url.Values) string {
    req, err := http.NewRequest(method, url, strings.NewReader(c.Encode(form)))
    if err != nil {
        fmt.Println(err)
        return ""
    }
    req.Header = c.Header
    req.AddCookie(&c.Cookie)
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        // resp is nil on error, so return before touching resp.Body.
        fmt.Println(err)
        return ""
    }
    defer resp.Body.Close()

    // Reassemble the body rune by rune; this does not decode \uXXXX escapes.
    scanner := bufio.NewScanner(resp.Body)
    scanner.Split(bufio.ScanRunes)

    var buf bytes.Buffer
    for scanner.Scan() {
        buf.WriteString(scanner.Text())
    }
    rv := buf.String()
    fmt.Println(rv)
    return rv
}

Here is a sample of the output:

{"forum":{"id":"3251718","name":"\u5408\u80a5\u5de5\u4e1a\u5927\u5b66\u5ba3\u57ce\u6821\u533a","first_class":"\u9ad8\u7b49\u9662\u6821","second_class":"\u5b89\u5fbd\u9662\u6821","is_like":"0","user_level":"1","level_id":"1","level_name":"\u7d20\u672a\u8c0b\u9762","cur_score":"0","levelup_score":"5","member_num":"80329","is_exists":"1","thread_num":"108762","post_num":"3445881","good_classify":[{"class_id":"0","class_name":"\u5168\u90e8"},{"class_id":"1","class_name":"\u516c\u544a\u7c7b"},{"class_id":"2","class_name":"\u5427\u53cb\u4e13\u533a"},{"class_id":"4","class_name":"\u6d3b\u52a8\u4e13\u533a"},{"class_id":"6","class_name":"\u793e\u56e2\u73ed\u7ea7"},{"class_id":"5","class_name":"\u8d44\u6e90\u5171\u4eab"},{"class_id":"8","class_name":"\u6e29\u99a8\u751f\u6d3b\u7c7b"},{"class_id":"7","class_name":"\u54a8\u8be2\u65b0\u95fb\u7c7b"},{"class_id":"3","class_name":"\u98ce\u91c7\u5c55\u793a\u533a"}],"managers":[{"id":"793092593","name":"易\u62b9\u660e\u5a9a\u7684\u5fe7\u4f24"},

...

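1 Answer:

Answer 0 (score: 1)

This is simply the standard way of escaping Unicode characters in JSON: a \uXXXX sequence inside a JSON string stands for the character with that code point, so the response is valid JSON and only looks unreadable when printed raw.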
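To illustrate that the escaped and unescaped spellings are the same JSON value, here is a minimal sketch. The helper name decodeJSONString is made up for this example; it only wraps json.Unmarshal from the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeJSONString is a hypothetical helper for this sketch: it decodes one
// JSON string literal (surrounding quotes included) into a Go string.
func decodeJSONString(raw string) (string, error) {
	var s string
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

func main() {
	// Same text, once with \u escapes and once as literal UTF-8.
	escaped, err1 := decodeJSONString(`"\u5168\u90e8"`)
	literal, err2 := decodeJSONString(`"全部"`)
	fmt.Println(escaped == literal, err1, err2) // true <nil> <nil>

	// Marshaling the decoded value again writes the characters as UTF-8,
	// because encoding/json does not escape ordinary non-ASCII text.
	out, err := json.Marshal(escaped)
	fmt.Println(string(out), err)
}
```

So a server that sends \u escapes and one that sends raw UTF-8 are, as far as a JSON decoder is concerned, sending the same data.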

Unmarshal it to see the unquoted text (the json package will unquote it for you):

func main() {
    var i interface{}
    err := json.Unmarshal([]byte(src), &i)
    fmt.Println(err, i)
}

const src = `{"forum":{"id":"3251718","name":"\u5408\u80a5\u5de5\u4e1a\u5927\u5b66\u5ba3\u57ce\u6821\u533a","first_class":"\u9ad8\u7b49\u9662\u6821","second_class":"\u5b89\u5fbd\u9662\u6821","is_like":"0","user_level":"1","level_id":"1","level_name":"\u7d20\u672a\u8c0b\u9762","cur_score":"0","levelup_score":"5","member_num":"80329","is_exists":"1","thread_num":"108762","post_num":"3445881","good_classify":[{"class_id":"0","class_name":"\u5168\u90e8"},{"class_id":"1","class_name":"\u516c\u544a\u7c7b"},{"class_id":"2","class_name":"\u5427\u53cb\u4e13\u533a"},{"class_id":"4","class_name":"\u6d3b\u52a8\u4e13\u533a"},{"class_id":"6","class_name":"\u793e\u56e2\u73ed\u7ea7"},{"class_id":"5","class_name":"\u8d44\u6e90\u5171\u4eab"},{"class_id":"8","class_name":"\u6e29\u99a8\u751f\u6d3b\u7c7b"},{"class_id":"7","class_name":"\u54a8\u8be2\u65b0\u95fb\u7c7b"},{"class_id":"3","class_name":"\u98ce\u91c7\u5c55\u793a\u533a"}]}}`

Output (trimmed) (try it on the Go Playground):

<nil> map[forum:map[levelup_score:5 is_exists:1 post_num:3445881 good_classify:[map[class_id:0 class_name:全部] map[class_id:1 class_name:公告类] map[class_id:2 class_name:吧友专区] map[class_id:4 class_name:活动专区] map[class_id:6 class_name:社团班级] map[class_id:5 class_name:资源共享] map[class_id:8 class_name:温馨生活类] map[class_name:咨询新闻类 class_id:7] map[class_id:3 class_name:风采展示区]] id:3251718 is_like:0 cur_score:0

If you only want to unquote a fragment, you can use strconv.Unquote():

fmt.Println(strconv.Unquote(`"\u7d20\u672a\u8c0b"`))

Output (try it on the Go Playground):

素未谋 <nil>

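Note that strconv.Unquote() expects a string that is itself wrapped in quotes. That is why I used a raw string literal: it lets me include the quotes, and it also stops the compiler from interpreting (unquoting) the Unicode escapes itself.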

See the related question: How to convert escape characters in HTML tags?