Converting ISO-8859-1 charcodes to UTF-8

Date: 2017-07-11 12:02:12

Tags: php character-encoding

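I have an input string that looks like this:

4BFC434845000000

Every two characters in the input string represent a hexadecimal character code in ISO-8859-1.

  • The first two characters in the example (4B) represent the number 4B₁₆, which stands for K in ISO-8859-1.
  • The next two characters (FC) represent the number FC₁₆, which stands for the German u-umlaut (ü) in ISO-8859-1.

The example string above represents Küche, the German word for kitchen.

The input string is guaranteed to be 16 characters long, so the resulting string is always at most 8 characters long. Unused characters (as in the example) are 00.

I know I can use iconv or other PHP functions to convert an ISO-8859-1 string to another character encoding. What I don't know is how to convert an ISO-8859-1 charcode (e.g. FC₁₆ or 252₁₀) into a UTF-8 character.

Of course, I could use an associative array that maps every character code to the character it represents: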

$table = array(
  0x4B => 'K',
  0xFC => 'ü',
  // ...
);

What is the best way to achieve the same result? Is there a PHP function that does this?

1 answer:

Answer 0 (score: 3)

This is fairly simple: convert the hex string to binary, then convert the ISO-8859-1 bytes to UTF-8. Optionally, strip the 00 padding bytes at some point.
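The two steps described in this answer can be sketched roughly as follows. This is a minimal illustration rather than the answerer's exact code: the function name hexLatin1ToUtf8 is hypothetical, and it assumes the iconv extension is available and that the unused 00 bytes only ever appear at the end of the input.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original answer).
function hexLatin1ToUtf8(string $hex): string
{
    // Step 1: hex2bin() turns each pair of hex digits into one raw byte.
    // The @ suppresses the warning for bad input; we check the result instead.
    $bytes = @hex2bin($hex);
    if ($bytes === false) {
        throw new InvalidArgumentException('Input is not a valid hex string');
    }

    // Optional: drop the trailing 0x00 padding bytes (assumes padding is only at the end).
    $bytes = rtrim($bytes, "\0");

    // Step 2: re-encode the ISO-8859-1 bytes as UTF-8.
    $utf8 = iconv('ISO-8859-1', 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException('Conversion to UTF-8 failed');
    }

    return $utf8;
}

echo hexLatin1ToUtf8('4BFC434845000000'), PHP_EOL;
```

Note that the bytes 43, 48 and 45 in the sample input are the uppercase letters C, H and E, so this particular string decodes to KüCHE. No lookup table is needed, because iconv already knows the mapping from every ISO-8859-1 byte to its UTF-8 sequence.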