Get the Arabic Unicode characters for the 4 different forms (initial, medial, final, or isolated) of a given string in JavaScript

Date: 2016-12-28 16:50:39

Tags: javascript unicode arabic

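I have Arabic content such as ضضضضضضض. I want to get the Unicode code points for the form each letter takes in a given string (initial, medial, final, or isolated).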

1 answer:

Answer 0 (score: 2)

There is a JavaScript library (not mine) that can do this for you: https://github.com/louy/Javascript-Arabic-Reshaper

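It takes a string that uses only the "generic" characters and returns a new string in which all the correct position-specific substitutions have been made for you. From there, you can read the character code (or code point) at each position.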
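To make "position-specific substitution" concrete, here is a minimal sketch of the idea for one letter only. It is not the library's implementation: the names `DAD_FORMS`, `formAt`, and `shapeDadRun` are hypothetical, and the sketch assumes the input is an unbroken run of ض (U+0636). Real text also needs per-letter joining rules, ligatures, and diacritics, which is why a library is the practical choice.

```javascript
// Hypothetical lookup table: the four presentation forms of DAD (U+0636)
// in the Arabic Presentation Forms-B block.
const DAD = 0x0636;
const DAD_FORMS = {
  isolated: 0xfebd,
  final: 0xfebe,
  initial: 0xfebf,
  medial: 0xfec0,
};

// Pick the form name for position i in a run of `length` joined letters.
function formAt(i, length) {
  if (length === 1) return 'isolated';
  if (i === 0) return 'initial';
  if (i === length - 1) return 'final';
  return 'medial';
}

// Simplified assumption: the string consists only of U+0636.
function shapeDadRun(str) {
  const chars = Array.from(str);
  if (!chars.every((ch) => ch.codePointAt(0) === DAD)) {
    throw new Error('this sketch only handles runs of U+0636');
  }
  return chars
    .map((_, i) => String.fromCodePoint(DAD_FORMS[formAt(i, chars.length)]))
    .join('');
}

const shapedRun = shapeDadRun('\u0636\u0636\u0636');
console.log(Array.from(shapedRun, (ch) => ch.codePointAt(0).toString(16)));
// logs [ 'febf', 'fec0', 'febe' ]
```

The first letter gets the initial form, the last gets the final form, and everything in between gets the medial form; a single letter on its own stays isolated.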

Here is an example of using the library:

// Import the library
var ArabicReshaper = require('arabic-reshaper');

// This can be a plain string. I just want to make sure I am feeding
// it the "plain" letter, not the initial/middle/end forms
var originalString = String.fromCharCode(0x0636, 0x0636); //ضض

// This converts it to the "shaped" letters. That means the letters
// are transformed into their initial/medial/final forms in the string
// itself (not just when it is drawn to the screen).
var newString = ArabicReshaper.convertArabic(originalString);

// And get the values. These will be the specific initial/middle/end values, not the generic ones
console.log(
    newString.codePointAt(0).toString(16), // outputs febf
    newString.codePointAt(1).toString(16) // outputs febe
);
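
To list the code point at every position rather than just the first two, a small helper can walk the whole string. This is a sketch, not part of the library: `codePointsOf` is a hypothetical name, and the hard-coded `shapedSample` string stands in for whatever the reshaper returns, so the snippet runs without any dependency.

```javascript
// Hypothetical helper: format every code point in a string as U+XXXX.
// Array.from iterates by code point, so surrogate pairs are not split.
function codePointsOf(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// Stand-in for a reshaped string: initial, medial, and final forms of DAD.
const shapedSample = '\uFEBF\uFEC0\uFEBE';

console.log(codePointsOf(shapedSample));
// logs [ 'U+FEBF', 'U+FEC0', 'U+FEBE' ]
```

Passing the unshaped input through the same helper is a quick way to compare the generic code points with the position-specific ones.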