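Working with grapheme clusters in Dart

Time: 2019-02-01 16:12:01

Tags: unicode dart flutter icu grapheme-cluster

As far as I can tell, Dart does not support grapheme clusters, although there has been talk of adding support.

Until that is implemented, what are my options for iterating over grapheme clusters? For example, if I have a string like this: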

String family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}'; // 👨‍👩‍👧
String myString = 'Let me introduce my $family to you.';

and the cursor sits immediately after the five-code-point family emoji, how do I move the cursor one user-perceived character to the left?

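(In this particular case I know the size of the grapheme cluster, so I could hard-code it, but what I am really asking is how to find the length of an arbitrarily long grapheme cluster.)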
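To make the mismatch concrete, here is a minimal sketch in plain Dart (no Flutter needed) that compares the three ways of counting the family emoji. It uses only `String.length` and `String.runes` from the core library. The variable names are illustrative.

```dart
void main() {
  // Man + ZWJ + woman + ZWJ + girl: five code points, one perceived character.
  final family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}';

  // String.length counts UTF-16 code units. Each of the three emoji needs a
  // surrogate pair (2 units) and each ZWJ needs 1 unit: 3 * 2 + 2 = 8.
  print(family.length); // 8

  // String.runes iterates over Unicode code points.
  print(family.runes.length); // 5

  // Cursor offsets in a TextSelection are code unit offsets, so the position
  // just after the emoji is the emoji's start plus 8.
  final myString = 'Let me introduce my $family to you.';
  final cursorAfterFamily = myString.indexOf(family) + family.length;
  print(cursorAfterFamily); // 28
}
```

Neither count gives the answer a user expects, which is 1. That is the gap a grapheme-aware API would have to fill.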

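Update

I see from this article that Swift uses the system's ICU library. Something similar may be possible in Flutter.

Supplemental code

For those who want to play with my example above, here is a demo project. The buttons move the cursor right or left. It currently takes 8 button presses to move the cursor past the family emoji.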

main.dart

import 'package:flutter/material.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(title: Text('Grapheme cluster testing')),
        body: BodyWidget(),
      ),
    );
  }
}

class BodyWidget extends StatefulWidget {
  @override
  _BodyWidgetState createState() => _BodyWidgetState();
}

class _BodyWidgetState extends State<BodyWidget> {

  TextEditingController controller = TextEditingController(
      text: 'Let me introduce my \u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467} to you.'
  );

  @override
  Widget build(BuildContext context) {
    return Column(
      children: <Widget>[
        TextField(
          controller: controller,
        ),
        Row(
          children: <Widget>[
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: RaisedButton(
                child: Text('<<'),
                onPressed: () {
                  _moveCursorLeft();
                },
              ),
            ),
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: RaisedButton(
                child: Text('>>'),
                onPressed: () {
                  _moveCursorRight();
                },
              ),
            ),
          ],
        )
      ],
    );
  }

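  // Moves the cursor one UTF-16 code unit to the left. This naive step is why
  // the family emoji (8 code units) takes 8 presses to cross.
  void _moveCursorLeft() {
    int currentCursorPosition = controller.selection.start;
    // selection.start is -1 while the field has no valid selection.
    if (currentCursorPosition <= 0)
      return;
    int newPosition = currentCursorPosition - 1;
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }

  // Moves the cursor one UTF-16 code unit to the right.
  void _moveCursorRight() {
    int currentCursorPosition = controller.selection.end;
    // Do nothing without a valid selection or at the end of the text.
    if (currentCursorPosition < 0 || currentCursorPosition == controller.text.length)
      return;
    int newPosition = currentCursorPosition + 1;
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }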
}

2 answers:

Answer 0 (score: 5)

Update: use https://pub.dartlang.org/packages/icu

Sample code:

import 'dart:async';

import 'package:flutter/material.dart';
import 'package:icu/icu.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(title: Text('Grapheme cluster testing')),
        body: BodyWidget(),
      ),
    );
  }
}

class BodyWidget extends StatefulWidget {
  @override
  _BodyWidgetState createState() => _BodyWidgetState();
}

class _BodyWidgetState extends State<BodyWidget> {
  final ICUString icuText = ICUString('Let me introduce my \u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467} to you.\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}');
  TextEditingController controller;

  _BodyWidgetState() {
    controller = TextEditingController(
      text: icuText.toString(),
    );
  }

  @override
  Widget build(BuildContext context) {
    return Column(
      children: <Widget>[
        TextField(
          controller: controller,
        ),
        Row(
          children: <Widget>[
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: RaisedButton(
                child: Text('<<'),
                onPressed: () async {
                  await _moveCursorLeft();
                },
              ),
            ),
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: RaisedButton(
                child: Text('>>'),
                onPressed: () async {
                  await _moveCursorRight();
                },
              ),
            ),
          ],
        )
      ],
    );
  }

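  // Returns Future<void> (not void) so that callers can await it.
  Future<void> _moveCursorLeft() async {
    int currentCursorPosition = controller.selection.start;
    // selection.start is -1 while the field has no valid selection.
    if (currentCursorPosition <= 0)
      return;
    // Ask ICU for the grapheme boundary before the cursor.
    int newPosition = await icuText.previousGraphemePosition(currentCursorPosition);
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }

  Future<void> _moveCursorRight() async {
    int currentCursorPosition = controller.selection.end;
    // Do nothing without a valid selection or at the end of the text.
    if (currentCursorPosition < 0 || currentCursorPosition == controller.text.length)
      return;
    // Ask ICU for the grapheme boundary after the cursor.
    int newPosition = await icuText.nextGraphemePosition(currentCursorPosition);
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }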
}


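Original answer:

Until Dart/Flutter fully implements ICU, I think your best bet is to use a PlatformChannel to pass the Unicode string to native code (Swift 4+ on iOS, or Java/Kotlin on Android), iterate over or manipulate it there, and send the result back.

  • For Swift 4+, it works out of the box, as described in the article you mentioned (not Swift 3 or earlier, and not Objective-C).
  • For Java/Kotlin, replace Oracle's BreakIterator with the one from the ICU library, which works much better. Nothing changes apart from the import statement.

The reason I suggest doing this natively (rather than in Dart) is that Unicode has too many things to handle, such as normalization, canonical equivalence, ZWNJ, ZWJ, ZWSP, and so on.

Leave a comment if you need some sample code.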
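As a rough sketch, the Dart side of that platform-channel round trip might look like the following. The channel name `example.com/grapheme` and the method name `previousBoundary` are hypothetical placeholders, not part of any existing plugin. The native side (not shown) would have to register a handler under the same names and compute the boundary with Swift's `String` or ICU's `BreakIterator`.

```dart
import 'package:flutter/services.dart';

// Hypothetical channel name; the native side must use the same one.
const MethodChannel _channel = MethodChannel('example.com/grapheme');

/// Asks native code for the grapheme boundary that precedes [offset].
/// Both the argument and the result are UTF-16 code unit offsets.
Future<int> previousGraphemeBoundary(String text, int offset) async {
  // 'previousBoundary' is a hypothetical method name handled natively.
  final result = await _channel.invokeMethod<int>(
    'previousBoundary',
    <String, dynamic>{'text': text, 'offset': offset},
  );
  // If the native side returns nothing, fall back to a one code unit step.
  return result ?? (offset > 0 ? offset - 1 : 0);
}
```

`invokeMethod` throws a `MissingPluginException` when no native handler is registered, so callers may want to catch that. On Android the offsets line up directly because Java strings are also UTF-16. On iOS the Swift side would need to convert between `String.Index` and UTF-16 offsets.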

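Answer 1 (score: 0)

The source code of the TextPainter class offers some clues about how to find grapheme clusters. Specifically, long emoji clusters such as the family emoji are built by joining code points with a zero width joiner (ZWJ), so you can use that knowledge to search for the end of a cluster.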

  // Unicode value for a zero width joiner character.
  static const int _zwjUtf16 = 0x200d;

  // Get the Rect of the cursor (in logical pixels) based off the near edge
  // of the character upstream from the given string offset.
  // TODO(garyq): Use actual extended grapheme cluster length instead of
  // an increasing cluster length amount to achieve deterministic performance.
  Rect _getRectFromUpstream(int offset, Rect caretPrototype) {
    final String flattenedText = _text.toPlainText();
    final int prevCodeUnit = _text.codeUnitAt(max(0, offset - 1));
    if (prevCodeUnit == null)
      return null;

    // Check for multi-code-unit glyphs such as emojis or zero width joiner
    final bool needsSearch = _isUtf16Surrogate(prevCodeUnit) || _text.codeUnitAt(offset) == _zwjUtf16;
    int graphemeClusterLength = needsSearch ? 2 : 1;
    List<TextBox> boxes = <TextBox>[];
    while (boxes.isEmpty && flattenedText != null) {
      final int prevRuneOffset = offset - graphemeClusterLength;
      boxes = _paragraph.getBoxesForRange(prevRuneOffset, offset);
      // When the range does not include a full cluster, no boxes will be returned.
      if (boxes.isEmpty) {
        // When we are at the beginning of the line, a non-surrogate position will
        // return empty boxes. We break and try from downstream instead.
        if (!needsSearch)
          break; // Only perform one iteration if no search is required.
        if (prevRuneOffset < -flattenedText.length)
          break; // Stop iterating when beyond the max length of the text.
        // Multiply by two to log(n) time cover the entire text span. This allows
        // faster discovery of very long clusters and reduces the possibility
        // of certain large clusters taking much longer than others, which can
        // cause jank.
        graphemeClusterLength *= 2;
        continue;
      }
      final TextBox box = boxes.first;

      // If the upstream character is a newline, cursor is at start of next line
      const int NEWLINE_CODE_UNIT = 10;
      if (prevCodeUnit == NEWLINE_CODE_UNIT) {
        return Rect.fromLTRB(_emptyOffset.dx, box.bottom, _emptyOffset.dx, box.bottom + box.bottom - box.top);
      }

      final double caretEnd = box.end;
      final double dx = box.direction == TextDirection.rtl ? caretEnd - caretPrototype.width : caretEnd;
      return Rect.fromLTRB(min(dx, width), box.top, min(dx, width), box.bottom);
    }
    return null;
  }

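Here is a file in the minikin library of Flutter's libtxt engine that handles grapheme clusters. I'm not sure whether it can be accessed directly, but it may be useful as a reference.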
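The zero-width-joiner idea can be sketched in plain Dart as follows. This is only a heuristic under stated assumptions: it treats surrogate pairs and ZWJ-joined sequences as single units and ignores every other rule of Unicode text segmentation (combining marks, variation selectors, skin-tone modifiers, regional-indicator flags, Hangul, and so on). It is not how TextPainter or minikin does it, and the function names are made up for illustration.

```dart
const int _zwj = 0x200D; // ZERO WIDTH JOINER

bool _isHighSurrogate(int unit) => (unit & 0xFC00) == 0xD800;
bool _isLowSurrogate(int unit) => (unit & 0xFC00) == 0xDC00;

/// Returns the start of the code point that ends at [offset].
int _codePointStartBefore(String text, int offset) {
  if (offset <= 0) return 0;
  if (offset >= 2 &&
      _isLowSurrogate(text.codeUnitAt(offset - 1)) &&
      _isHighSurrogate(text.codeUnitAt(offset - 2))) {
    return offset - 2; // The code point is a surrogate pair.
  }
  return offset - 1;
}

/// Heuristic: the code unit offset one "character" to the left of [offset],
/// where a character is a code point or a run of code points glued by ZWJ.
int previousBoundary(String text, int offset) {
  if (offset <= 0) return 0;
  if (offset > text.length) offset = text.length;
  var pos = _codePointStartBefore(text, offset);
  // Keep stepping back while pos sits on a ZWJ or directly after one,
  // because in both cases pos is still inside a joined sequence.
  while (pos > 0 &&
      (text.codeUnitAt(pos) == _zwj || text.codeUnitAt(pos - 1) == _zwj)) {
    pos = _codePointStartBefore(text, pos);
  }
  return pos;
}

void main() {
  final family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}';
  final text = 'Let me introduce my $family to you.';
  // The cursor just after the emoji is at code unit 28; the emoji starts at 20.
  print(previousBoundary(text, 28)); // 20
  print(previousBoundary(text, 20)); // 19
}
```

With this helper, one press of the left button would cross the whole family emoji instead of taking eight. For production use, a full implementation of the segmentation rules (for example via ICU) remains the safer choice.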