Why are non-ASCII characters not equal?

Time: 2013-06-18 23:25:23

Tags: ruby byte

I have two test cases that call the data_valid? method. The first returns false and the second returns true. Why?

    55: def data_valid? d
    56:   crc = d[-1]
    57:   data = d[1..-2]
    58:   len = d[0]
 => 59:   binding.pry
    60:   (data ^ len) == crc
    61: end

2.0.0 (#<MicroAeth::Message:0x007fbefc3ceae8>):0 > (data ^ len) == crc
=> false
2.0.0 (#<MicroAeth::Message:0x007fbefc3ceae8>):0 > (data ^ len)
=> "\xB1"
2.0.0 (#<MicroAeth::Message:0x007fbefc3ceae8>):0 > crc
=> "\xB1"
2.0.0 (#<MicroAeth::Message:0x007fbefc3ceae8>):0 > exit
have a good day!
F
From: /Users/rudolph9/Projects/CombustionEmissionsTesting/micro_aeth.rb @ line 59 MicroAeth::Message#data_valid?:

    55: def data_valid? d
    56:   crc = d[-1]
    57:   data = d[1..-2]
    58:   len = d[0]
 => 59:   binding.pry
    60:   (data ^ len) == crc
    61: end

2.0.0 (#<MicroAeth::Message:0x007fbefe83a8c8>):0 > (data ^ len) == crc
=> true
2.0.0 (#<MicroAeth::Message:0x007fbefe83a8c8>):0 > (data ^ len)
=> "+"
2.0.0 (#<MicroAeth::Message:0x007fbefe83a8c8>):0 > crc
=> "+"

Below is my extension to the String class. The value I am comparing is the return value of the custom XOR method ^.

  class ::String
    ###
    # @return the first byte of the string as an integer
    def byte
      self.bytes[0]
    end

    ### 
    # XOR two strings
    # @str assumed to be a one byte string or integer
    def ^ str
      if str.class == String
        str = str.byte
      elsif str.class == Fixnum
        nil
      else
        raise "invalid arg: #{str.class} \n Must be String or Fixnum"
      end
      self.bytes.each do |i|
        str = str ^ i
      end
      str.chr
    end
  end

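I think this has something to do with the first comparison involving a non-ASCII character. How do I set up the conditional correctly?

1 answer:

Answer 0: (score: 1)

You can use String#force_encoding to retag a string with a specified encoding (the bytes themselves are left unchanged):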

2.0.0-p195 :001 > "\xB1".encoding
 => #<Encoding:UTF-8> 
2.0.0-p195 :002 > eight_bit = "\xB1".force_encoding(Encoding::ASCII_8BIT)
 => "\xB1" 
2.0.0-p195 :003 > eight_bit.encoding
 => #<Encoding:ASCII-8BIT> 
2.0.0-p195 :004 > eight_bit == "\xB1"
 => false 
2.0.0-p195 :005 > eight_bit.force_encoding(Encoding::UTF_8) == "\xB1"
 => true
2.0.0-p195 :006 > eight_bit.force_encoding("\xB1".encoding) == "\xB1"
 => true
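
Applied to the check in the question, the same idea can be wrapped in a small helper. This is only a sketch: `same_bytes?` is a hypothetical name that does not appear in the original code, and it assumes the intent is a byte-for-byte comparison that ignores the encoding tags.

```ruby
# Hypothetical helper (not part of the original post): compares two strings
# byte-for-byte by retagging copies of both as binary before using ==.
def same_bytes?(a, b)
  a.dup.force_encoding(Encoding::ASCII_8BIT) ==
    b.dup.force_encoding(Encoding::ASCII_8BIT)
end

utf8_byte   = "\xB1".dup.force_encoding(Encoding::UTF_8) # stand-in for crc
binary_byte = 0xB1.chr                                   # stand-in for (data ^ len)

puts utf8_byte == binary_byte             # false
puts same_bytes?(utf8_byte, binary_byte)  # true
```

Working on copies (`dup`) keeps the caller's strings untouched, since force_encoding modifies its receiver in place. On Ruby 2.0 and later, `String#b` returns a binary-tagged copy and could be used instead.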

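Note that the default source encoding in Ruby 2.0.0 is UTF-8, so the literal "\xB1" is tagged UTF-8. Integer#chr, which the ^ method ends with, returns an ASCII-8BIT string for byte values of 0x80 and above. Two strings with different encodings compare equal only if their bytes match and are all ASCII, which is presumably why "+" matches and "\xB1" does not.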
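
To see where the two encodings come from, here is a minimal sketch. The constants `FROM_MESSAGE` and `FROM_XOR` are illustrative names, and the first one assumes the message string in the question was tagged UTF-8, which the question does not state.

```ruby
# Stand-in for a byte sliced out of the message (assumed to be tagged UTF-8).
FROM_MESSAGE = "\xB1".dup.force_encoding(Encoding::UTF_8)
# Integer#chr with no argument returns a binary string for values 0x80..0xFF.
FROM_XOR = 0xB1.chr

puts FROM_MESSAGE.encoding == Encoding::UTF_8        # true
puts FROM_XOR.encoding == Encoding::ASCII_8BIT       # true
puts FROM_MESSAGE.bytes.to_a == FROM_XOR.bytes.to_a  # true: same single byte
puts FROM_MESSAGE == FROM_XOR                        # false: different encodings, non-ASCII byte

# For values below 0x80, Integer#chr returns a US-ASCII string, and
# ASCII-only strings compare equal across ASCII-compatible encodings.
puts 0x2B.chr.encoding == Encoding::US_ASCII         # true
puts "+" == 0x2B.chr                                 # true
```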