Question

We are loading JSON data with JS, and the data often contains multiple backslashes before a newline character. Example:

{
    "test": {
        "title": "line 1\\\\\\\nline2"
    }
}

I have tried a variety of RegEx patterns with replace. "Oddly," they seem to work when the number of backslashes is even, but not when it is odd.

This example has 2 backslashes:

"\\n".replace(/\\(?=.{2})/g, '');

This sample has 3:

"\\\n".replace(/\\(?=.{2})/g, '');

Here is the JS in action:

console.log('Even Slashes:');
console.log("\\n".replace(/\\(?=.{2})/g, ''));
console.log('Odd Slashes:');
console.log("\\\n".replace(/\\(?=.{2})/g, ''));

Answer 1

I think you are trying to remove all backslashes before a newline: str.replace(/\\+\n/g, "\n").

Also, you may be misunderstanding how escape sequences work:

"\\" is one backslash
"\\n" is one backslash followed by the letter n

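See the code below for an explanation. Note that Stack Overflow's console output re-encodes the strings; the browser's actual dev tools do a better job and show the characters in their escaped form.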

const regex = /\\+\n/g;
// This is "Hello" + [two backslashes] + "nworld"
const evenSlashes = "Hello\\\\nworld";
// This is "Hello" + [two backslashes] + [newline] + "world"
const oddSlashes = "Hello\\\\\nworld";
console.log({
   evenSlashes,
   oddSlashes,
   // Doesn't replace anything because there's no newline on this string
   replacedEvenSlashes: evenSlashes.replace(regex, "\n"),
   // All backslashes before new line are replaced
   replacedOddSlashes: oddSlashes.replace(regex, "\n")
});
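
To tie this back to the JSON in the question, here is a minimal sketch that parses a payload like the one shown and applies the same pattern to the parsed value. The payload and the variable names are assumptions for illustration; String.raw is used only so the JSON text can be written in JS source without a second layer of escaping.

```javascript
// Raw JSON text as it would arrive (illustrative payload modelled on the question).
const rawJson = String.raw`{"test": {"title": "line 1\\\\\\\nline2"}}`;

// JSON.parse decodes the JSON escapes: the title now holds
// "line 1" + three backslashes + a line feed + "line2".
const data = JSON.parse(rawJson);

// Remove every run of backslashes that sits directly before a line feed.
const cleanedTitle = data.test.title.replace(/\\+\n/g, "\n");

// JSON.stringify is used here only to print the strings in escaped form.
console.log(JSON.stringify(data.test.title));
console.log(JSON.stringify(cleanedTitle));
```

Backslashes that are not directly followed by a line feed are left alone by this pattern.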

Answer 2

As I mentioned in my earlier comment, you are dealing with two different escape sequences here:

\n is the escape sequence for a newline, i.e. the Unicode character 'LINE FEED (LF)' (U+000A)
\\ is the escape sequence for a backslash, i.e. the Unicode character 'REVERSE SOLIDUS' (U+005C)

Although each of these escape sequences is two characters in source code, it represents only one character in memory.

Observe:

// toSource() is non-standard, so JSON.stringify is used here to show the escaped form
const toEscaped = s => JSON.stringify(s);
['\n', '\\n', '\\\n', '\\\\n', '\\\\\n']
  .forEach(s => console.log(`There are ${s.length} character(s) in ${toEscaped(s)}`))

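This applies to regular expressions as well. The \n actually counts as one character, so the lookahead (?=.{2}) will try to take in the preceding \ as well, which is why you may see some odd behaviour in how your replace works.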
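
A quick way to check this against the two strings from the question is sketched below (the variable names are mine, not the asker's). Each string holds only two characters in memory, so after the backslash there is a single character left and a lookahead that demands two more characters cannot succeed.

```javascript
const pattern = /\\(?=.{2})/g;

const backslashThenN = "\\n";   // backslash + letter n
const backslashThenLf = "\\\n"; // backslash + line feed

// Both strings are two characters long.
console.log(backslashThenN.length, backslashThenLf.length);

// Only one character follows the backslash, so the lookahead fails
// and replace() returns each string unchanged.
const afterN = backslashThenN.replace(pattern, '');
const afterLf = backslashThenLf.replace(pattern, '');
console.log(afterN === backslashThenN, afterLf === backslashThenLf);
```

The visible difference between the two console outputs comes from how the last character is displayed (a letter n versus an actual line break), not from the replacement.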

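However, based on some of your comments, it sounds like you may be dealing with incorrectly encoded input. For example, a user types foo\nbar into an input field, which is interpreted as a literal \ followed by n (i.e. "foo\\nbar"), and you now want it to be interpreted as a newline (i.e. "foo\nbar"). In that case you are not really trying to remove \ characters; you are trying to convert the character sequence \ + n into \n.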

The following snippet shows how to perform escape-sequence replacement for \\ and \n:

// toSource() is non-standard, so JSON.stringify is used here to show the escaped form
const toEscaped = s => JSON.stringify(s);
const toHex = s => Array.from(s).map((_, i) => s.charCodeAt(i).toString(16).padStart(2, '0')).join('+');
['\n', '\\n', '\\\n', '\\\\n', '\\\\\n']
  .map(s => ({ a: s, b: s.replace(/\\n/g, '\n').replace(/\\\\/g, '\\') }))
  .forEach(({a, b}) => console.log(`${toEscaped(a)} --> ${toHex(b)}`))
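
One caveat with chaining two replace calls: an escaped backslash followed by the letter n (\\n in the text) can be read as a newline escape by the first pass. A possible alternative, sketched here with a hypothetical helper name (unescapeBasic is not part of the original answer), decodes both escapes in a single left-to-right pass.

```javascript
// Hypothetical helper: decode only the \\ and \n escapes in one pass,
// so each backslash is consumed exactly once.
const unescapeBasic = s =>
  s.replace(/\\([\\n])/g, (_, ch) => (ch === 'n' ? '\n' : '\\'));

// In memory: backslash, backslash, letter n
const input = '\\\\n';

// Chained replaces: the second backslash and the n are decoded as a line feed first.
const chained = input.replace(/\\n/g, '\n').replace(/\\\\/g, '\\');

// Single pass: the two backslashes are decoded as one backslash, the n stays a letter.
const singlePass = unescapeBasic(input);

console.log(JSON.stringify(chained), JSON.stringify(singlePass));
```

Which result is wanted depends on how the input was produced, so treat this as an option to test against your own data.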

And to both replace "\\n" with "\n" and remove the preceding "\\" characters, try something like this:

// toSource() is non-standard, so JSON.stringify is used here to show the escaped form
const toEscaped = s => JSON.stringify(s);
const toHex = s => Array.from(s).map((_, i) => s.charCodeAt(i).toString(16).padStart(2, '0')).join('+');
['\n', '\\n', '\\\n', '\\\\n', '\\\\\n']
  .map(s => ({ a: s, b: s.replace(/\\+[n\n]/g, '\n') }))
  .forEach(({a, b}) => console.log(`${toEscaped(a)} --> ${toHex(b)}`))

Answer 3

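To remove all escaped escapes (doubled backslashes) from source text:
Find: /([^\\]|^)(?:\\\\)+/g
Replace: \1 (written $1 in a JavaScript replacement string)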
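
As a rough sketch of how that find/replace could be written in JavaScript (the function name is an assumption for illustration), the pattern is passed to replace with $1 as the replacement so the character before the backslash run is kept.

```javascript
// Hypothetical wrapper around the pattern above: drops runs of doubled
// backslashes and keeps the character (or start of string) before them.
const removeEscapedBackslashes = s =>
  s.replace(/([^\\]|^)(?:\\\\)+/g, '$1');

// In memory: a, backslash, backslash, b
const evenRun = removeEscapedBackslashes('a\\\\b');

// In memory: a, backslash, backslash, backslash, n
// One pair is removed; the remaining backslash + n is left in place.
const oddRun = removeEscapedBackslashes('a\\\\\\n');

console.log(JSON.stringify(evenRun), JSON.stringify(oddRun));
```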

Remove multiple backslashes while keeping the \n special character, using a JavaScript regular expression
