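I'm trying to port code from Python to Ruby, and I'm stuck on a function that encodes a UTF-8 string as JSON.
I've stripped the code down to what I believe is the problem.
I want Ruby to produce exactly the same output as Python.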
#!/usr/bin/env python
# encoding: utf-8
import json
import hashlib
text = "ÀÈG"
js = json.dumps( { 'data': text } )
print 'Python:'
print js
print hashlib.sha256(js).hexdigest()
#!/usr/bin/env ruby
require 'json'
require 'digest'
text = "ÀÈG"
obj = {'data': text}
# js = obj.to_json # not using this, in order to get the space below
js = %Q[{"data": "#{text}"}]
puts 'Ruby:'
puts js
puts Digest::SHA256.hexdigest js
$ ./test.rb && ./test.py
Ruby:
{"data": "ÀÈG"}
6cbe518180308038557d28ecbd53af66681afc59aacfbd23198397d22669170e
Python:
{"data": "\u00c0\u00c8G"}
a6366cbd6750dc25ceba65dce8fe01f283b52ad189f2b54ba1bfb39c7a0b96d3
What do I need to change in the Ruby code so that its output matches the Python output (or at least so that the final hash matches)?
Answer 0 (score: 1)
Someone will surely come up with a more elegant (or at least more efficient and robust) solution, but for now here is one:
#!/usr/bin/env ruby
require 'json'
require 'digest'
text = 'ÀÈG'
  .encode('UTF-16')                           # convert UTF-8 characters to UTF-16
  .inspect                                    # escape UTF-16 characters and convert back to UTF-8
  .sub(/^"\\u[Ff][Ee][Ff][Ff](.*?)"$/, '\1')  # remove outer quotes and BOM
  .gsub(/\\u\w{4}/, &:downcase)               # downcase alphas in escape sequences (downcase! returns nil when nothing changes, which would delete the match)
js = { data: text }                           # wrap in containing data structure
  .to_json(:space=>' ')                       # convert to JSON with spaces after colons
  .gsub(/\\\\u(?=\w{4})/, '\\u')              # remove extra backslashes
puts 'Ruby:', js, Digest::SHA256.hexdigest(js)
Output:
$ ./test.rb
Ruby:
{"data": "\u00c0\u00c8G"}
a6366cbd6750dc25ceba65dce8fe01f283b52ad189f2b54ba1bfb39c7a0b96d3
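A possibly simpler route, offered as a sketch rather than a definitive fix: Python's `json.dumps` escapes non-ASCII characters by default, and Ruby's bundled `json` library has an `ascii_only` generator option that does the same, plus a `space` option for the space after each colon. The example assumes a single-key object like the one in the question. With several keys, Python also puts a space after each comma, and this sketch does not reproduce that.

```ruby
#!/usr/bin/env ruby
require 'json'
require 'digest'

text = 'ÀÈG'

# ascii_only: true escapes every non-ASCII character as \uXXXX.
# space: ' ' inserts a space after the colon, as Python's default separators do.
js = JSON.generate({ 'data' => text }, ascii_only: true, space: ' ')

puts 'Ruby:', js, Digest::SHA256.hexdigest(js)
```

If the generated string is byte-for-byte the same as the Python line, the SHA-256 digests will match as well, since the hash depends only on those bytes.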