I have an input string that looks like this:
4BFC434845000000
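Every two characters of the input string represent a hexadecimal character code in ISO-8859-1.
The first pair (4B) stands for the number 0x4B, which is K in ISO-8859-1. The next pair (FC) stands for the number 0xFC, which is the German u-umlaut (ü) in ISO-8859-1. The example string above decodes to KüCHE (Küche is the German word for kitchen).
The input string is guaranteed to be 16 characters long, so the resulting string is always 8 characters long. Unused characters (as in the example) are 00.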
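I know that I can use iconv or other functions in PHP to convert an ISO-8859-1 string to another character encoding. But I don't know how to convert an ISO-8859-1 character code (e.g. 0xFC, or 252 in decimal) to a UTF-8 character.
Of course, I could use an associative array that maps every character code to the character it represents: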
$table = array(
0x4B => 'K',
0xFC => 'ü',
// ...
);
What is the best way to achieve the same thing? Is there a PHP function that does this?
Answer 0: (score: 3)
This is fairly simple: convert the hex string to binary, then convert the ISO-8859-1 binary to UTF-8 binary.
Optionally, strip the NUL (00) bytes at some point.
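The two steps described above can be sketched as follows. This is a minimal sketch, not the original answer's code: it assumes PHP 7 or later with the iconv extension available, and the function names hexLatin1ToUtf8 and latin1CodeToUtf8 are hypothetical helpers made up for illustration. The second helper covers the single character code case from the question by going through chr().

```php
<?php
// Hypothetical helpers for illustration; the names are not from the original answer.

/**
 * Decode a hex string whose byte pairs are ISO-8859-1 codes into a UTF-8 string.
 */
function hexLatin1ToUtf8(string $hex): string
{
    // Reject odd-length or non-hex input up front so hex2bin() cannot fail.
    if (strlen($hex) % 2 !== 0 || !ctype_xdigit($hex)) {
        throw new InvalidArgumentException("Not a valid hex string: $hex");
    }

    // Step 1: hex digits -> raw bytes (still ISO-8859-1 encoded).
    $bytes = hex2bin($hex);

    // Optional step: drop the trailing NUL padding bytes.
    $bytes = rtrim($bytes, "\0");

    // Step 2: re-encode the ISO-8859-1 bytes as UTF-8.
    $utf8 = iconv('ISO-8859-1', 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException('iconv conversion failed');
    }
    return $utf8;
}

/**
 * Convert a single ISO-8859-1 character code (0..255) to a UTF-8 character.
 */
function latin1CodeToUtf8(int $code): string
{
    if ($code < 0 || $code > 255) {
        throw new InvalidArgumentException("Code out of range: $code");
    }
    // chr() yields the single byte, which iconv then re-encodes.
    return iconv('ISO-8859-1', 'UTF-8', chr($code));
}

echo hexLatin1ToUtf8('4BFC434845000000'), "\n"; // prints KüCHE
echo latin1CodeToUtf8(0xFC), "\n";              // prints ü
```

Here rtrim runs before the conversion, but stripping the NUL bytes afterwards should work just as well, since a NUL byte is encoded identically in ISO-8859-1 and UTF-8.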