Replacing non-printable ASCII characters in work notes

Asked: 2019-04-09 13:35:24

Tags: javascript regex

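I am trying to add text to work notes that may contain non-printable ASCII characters. These characters are not being replaced as expected before the text is stored in the database.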

<work_notes>
TEST
X  000  000  0x00  00000000  NUL  (Null char.)
  001  001  0x01  00000001  SOH  (Start of Header)
  002  002  0x02  00000010  STX  (Start of Text)
  003  003  0x03  00000011  ETX  (End of Text)
  004  004  0x04  00000100  EOT  (End of Transmission)
  005  005  0x05  00000101  ENQ  (Enquiry)
  006  006  0x06  00000110  ACK  (Acknowledgment)
  007  007  0x07  00000111  BEL  (Bell)
  008  010  0x08  00001000   BS  (Backspace)
      009  011  0x09  00001001   HT  (Horizontal Tab)

  010  012  0x0A  00001010   LF  (Line Feed)
  011  013  0x0B  00001011   VT  (Vertical Tab)
  012  014  0x0C  00001100   FF  (Form Feed)

  013  015  0x0D  00001101   CR  (Carriage Return)
  014  016  0x0E  00001110   SO  (Shift Out)
  015  017  0x0F  00001111   SI  (Shift In)
  016  020  0x10  00010000  DLE  (Data Link Escape)
  017  021  0x11  00010001  DC1  (XON)(Device Control 1)
  018  022  0x12  00010010  DC2  (Device Control 2)
  019  023  0x13  00010011  DC3  (XOFF)(Device Control 3)
  020  024  0x14  00010100  DC4  (Device Control 4)
  021  025  0x15  00010101  NAK  (Negative Acknowledgement)
  022  026  0x16  00010110  SYN  (Synchronous Idle)
  023  027  0x17  00010111  ETB  (End of Trans. Block)
  024  030  0x18  00011000  CAN  (Cancel)
  025  031  0x19  00011001   EM  (End of Medium)
  026  032  0x1A  00011010  SUB  (Substitute)
  027  033  0x1B  00011011  ESC  (Escape)
  028  034  0x1C  00011100   FS  (File Separator)
  029  035  0x1D  00011101   GS  (Group Separator)
  030  036  0x1E  00011110   RS  (Request to Send)(Record Separator)
  031  037  0x1F  00011111   US  (Unit Separator)
</work_notes>

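The squares shown in the work notes are the actual characters, but they are not displayed in the text area.

The code I wrote to replace the Escape character is: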

/**
 * Escape a string for XML.
 * @param {String} txt
 * @return {String}
 */
ImDataHelper.escapeXml = function (txt) {
  var str = txt;
  // Replace the escape character.
  txt = str.replace(/x1B/g,''); 
  // copied from SOAPMessage script include
  return Packages.org.apache.commons.lang.StringEscapeUtils.escapeXml(txt);
};

The output of running the transaction is as follows:

<work_notes>2019-04-09 13:31:37 - Shaji Kalidasan (Work Notes)
TEST
X  000  000  0x00  00000000  NUL  (Null char.)
  001  001  0x01  00000001  SOH  (Start of Header)
  002  002  0x02  00000010  STX  (Start of Text)
  003  003  0x03  00000011  ETX  (End of Text)
  004  004  0x04  00000100  EOT  (End of Transmission)
  005  005  0x05  00000101  ENQ  (Enquiry)
  006  006  0x06  00000110  ACK  (Acknowledgment)
  007  007  0x07  00000111  BEL  (Bell)
  008  010  0x08  00001000   BS  (Backspace)
      009  011  0x09  00001001   HT  (Horizontal Tab)

  010  012  0x0A  00001010   LF  (Line Feed)
  011  013  0x0B  00001011   VT  (Vertical Tab)
  012  014  0x0C  00001100   FF  (Form Feed)

  013  015  0x0D  00001101   CR  (Carriage Return)
  014  016  0x0E  00001110   SO  (Shift Out)
  015  017  0x0F  00001111   SI  (Shift In)
  016  020  0x10  00010000  DLE  (Data Link Escape)
  017  021  0x11  00010001  DC1  (XON)(Device Control 1)
  018  022  0x12  00010010  DC2  (Device Control 2)
  019  023  0x13  00010011  DC3  (XOFF)(Device Control 3)
  020  024  0x14  00010100  DC4  (Device Control 4)
  021  025  0x15  00010101  NAK  (Negative Acknowledgement)
  022  026  0x16  00010110  SYN  (Synchronous Idle)
  023  027  0x17  00010111  ETB  (End of Trans. Block)
  024  030  0x18  00011000  CAN  (Cancel)
  025  031  0x19  00011001   EM  (End of Medium)
  026  032  0x1A  00011010  SUB  (Substitute)
  027  033  0  00011011  ESC  (Escape)
  028  034  0x1C  00011100   FS  (File Separator)
  029  035  0x1D  00011101   GS  (Group Separator)
  030  036  0x1E  00011110   RS  (Request to Send)(Record Separator)
  031  037  0x1F  00011111   US  (Unit Separator)
</work_notes>

As you can see, it only replaced the literal text 'x1B' inside '0x1B', not the actual ASCII escape character that is shown as a square.

Please help.

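1 Answer:

Answer 0: (score: 0)

You need to put a backslash before the hex code in the regular expression, so that \x1B is read as an escape sequence for the character itself rather than as the literal text x1B. (My example logs the string length because the character is not displayed in the console.)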

var str = String.fromCharCode(27) + '  027  033  0x1B  00011011  ESC  (Escape)';

console.log('length: ', str.length);

str = str.replace(/\x1B/g, '');

console.log('length: ', str.length);

See in Regex101
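
To make the difference visible, here is a minimal sketch that runs both patterns against the same row. The variable names are made up for illustration, and the sample string is just the ESC row from the question with a real ESC character prepended:

```javascript
// Hypothetical sample: a real ESC character followed by the table row text.
var row = String.fromCharCode(27) + '  027  033  0x1B  00011011  ESC  (Escape)';

// Without the backslash, the pattern matches the three literal characters "x1B".
var literalMatch = row.replace(/x1B/g, '');

// With the backslash, \x1B is a hex escape that matches the control character U+001B.
var escapeMatch = row.replace(/\x1B/g, '');

// The unescaped pattern damaged the text "0x1B" and left the ESC character in place.
console.log(literalMatch.indexOf('0x1B') === -1, literalMatch.charCodeAt(0) === 27);

// The escaped pattern removed the ESC character and left the text "0x1B" intact.
console.log(escapeMatch.indexOf('0x1B') !== -1, escapeMatch.charCodeAt(0) !== 27);
```

This matches what the question's output shows: the row reads "0" where "0x1B" used to be, while the control character itself is still there.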

Note: you can also match them as a range: /[\x00-\x1F]/g

See in Regex101
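
One thing to watch with the full range: it also covers tab (\x09), line feed (\x0A) and carriage return (\x0D), which are the only control characters below \x20 that XML 1.0 allows, and which a multi-line work note probably needs to keep. The following is a sketch only, assuming line breaks and tabs should survive; `stripControlChars` is a hypothetical helper name, not something from the original script include:

```javascript
// Hypothetical helper, not part of the original script include.
// Removes the control characters \x00-\x1F except tab (\x09),
// line feed (\x0A) and carriage return (\x0D).
function stripControlChars(txt) {
  return String(txt).replace(/[\x00-\x08\x0B\x0C\x0E-\x1F]/g, '');
}

// Sample with NUL and ESC mixed in with a tab and a CR/LF line break.
var sample = 'A' + String.fromCharCode(0) + 'B' + String.fromCharCode(27) + 'C\tD\r\nE';
var cleaned = stripControlChars(sample);

console.log(JSON.stringify(cleaned)); // "ABC\tD\r\nE"
```

In the question's escapeXml function, a call like this could presumably take the place of the single /x1B/ replace, ahead of the StringEscapeUtils.escapeXml call.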