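I have Arabic content, such as ضضضضضضض. I want to get the Unicode code points of every letter in a given string, in whichever form it appears (initial, medial, final, or isolated).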
Answer 0 (score: 2)
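There is a JavaScript library (not mine) that can do this for you: https://github.com/louy/Javascript-Arabic-Reshaper
It takes a string that uses only the "generic" characters and returns a new string with all the correct position-specific substitutions already made. From there, you can read the char code (or code point) at each position.
Here is a sample usage: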
// Import the library
var ArabicReshaper = require('arabic-reshaper');
// This can be a plain string. I just want to make sure I am feeding
// it the "plain" letter, not the initial/middle/end forms
var originalString = String.fromCharCode(0x0636, 0x0636); //ضض
// This converts it to the 'shaped' letters. That means the letters
// are transformed into their initial/middle/end forms in the string
// itself (not just when it is drawn to the screen).
var newString = ArabicReshaper.convertArabic(originalString);
// And get the values. These will be the specific initial/middle/end values, not the generic ones
console.log(
newString.codePointAt(0).toString(16), // outputs febf
newString.codePointAt(1).toString(16) // outputs febe
);
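
To read the code point of every letter rather than just the first two, the reshaped string can be walked in a loop. The following is only a minimal sketch: `listCodePoints` is a hypothetical helper, not part of the library, and to keep the snippet runnable without the library, the input is built directly from the two presentation-form code points printed in the example above.

```javascript
// Hypothetical helper (not part of any library): returns the code point of
// every character in a string as a "U+XXXX" label.
function listCodePoints(str) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(str, function (ch) {
    return 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  });
}

// Stand-in for the reshaper output: the initial and final forms of DAD,
// i.e. the two values printed in the example above.
var shaped = String.fromCodePoint(0xFEBF, 0xFEBE);
console.log(listCodePoints(shaped)); // [ 'U+FEBF', 'U+FEBE' ]
```

In real use, you would presumably pass the string returned by `ArabicReshaper.convertArabic` instead of the hand-built `shaped` value.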