Question

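I have the following string:

[SM_g]This[SM_h][SM_g]is[SM_h][SM_g]a[SM_h][SM_g]sentence.[SM_h][SM_l][SM_g]Here[SM_h][SM_g]is[SM_h][SM_g]another[SM_h][SM_g]sentence.[SM_h][SM_1]

I can convert it into the following text and display it in a <p> element:

This is a sentence.

Here is another sentence.

using the following code: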

tokenResponseText_initial = "[SM_g]This[SM_h][SM_g]is[SM_h][SM_g]a[SM_h][SM_g]sentence.[SM_h][SM_l][SM_g]Here[SM_h][SM_g]is[SM_h][SM_g]another[SM_h][SM_g]sentence.[SM_h][SM_1]"

const newLineIndicator = "insert-double-new-line"

const tokenResponseText_fixedNewLines = tokenResponseText_initial.replace(/(\[SM_g].*?)(\[SM_h]\[SM_l])/g, "$1" + newLineIndicator + "$2");

const wordCompilationRegex = /\[SM_g](.*?)\[SM_h]/g;
var wordRegexResponse;

var summary = "";

do {
  wordRegexResponse = wordCompilationRegex.exec(tokenResponseText_fixedNewLines);
  if (wordRegexResponse) {
    if (wordRegexResponse[1].includes(newLineIndicator)) {
      summary += wordRegexResponse[1].replace(newLineIndicator, "") + "\n\n";
    } else {
      summary += wordRegexResponse[1] + " ";
    }
  }
} while (wordRegexResponse);
//The following is rough code, 
someParagraphElement.innerHTML = summary;

p {
  white-space: pre-line;
}

<p id="someParagraphElement"></p>

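where the paragraph element has the property white-space: pre-line;

Ideally, though, I would like to create the double line break between the two sentences without newLineIndicator, by inserting "\n\n" directly in the replacement, as in the second snippet below.

However, this second approach does not work. The final result contains no double line break, even though printing tokenResponseText_fixedNewLines to the console suggests that one was inserted:

[SM_g]This[SM_h][SM_g]is[SM_h][SM_g]a[SM_h][SM_g]sentence.

[SM_h][SM_l][SM_g]Here[SM_h][SM_g]is[SM_h][SM_g]another[SM_h][SM_g]sentence.[SM_h][SM_1]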

tokenResponseText_initial = "[SM_g]This[SM_h][SM_g]is[SM_h][SM_g]a[SM_h][SM_g]sentence.[SM_h][SM_l][SM_g]Here[SM_h][SM_g]is[SM_h][SM_g]another[SM_h][SM_g]sentence.[SM_h][SM_1]"

const tokenResponseText_fixedNewLines = tokenResponseText_initial.replace(/(\[SM_g].*?)(\[SM_h]\[SM_l])/g, "$1\n\n$2");

const wordCompilationRegex = /\[SM_g](.*?)\[SM_h]/g;
var wordRegexResponse;

var summary = "";

do {
  wordRegexResponse = wordCompilationRegex.exec(tokenResponseText_fixedNewLines);
  if (wordRegexResponse) {
    summary += wordRegexResponse[1] + " ";

  }
} while (wordRegexResponse);
//The following is rough code, 
someParagraphElement.innerHTML = summary;

p {
  white-space: pre-line;
}

<p id="someParagraphElement"></p>

Why does the second approach fail while the first one works? Doesn't .*? capture newlines?

Answer 1

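By default, . does not match newline characters. So once the first .replace() has inserted newlines into the string, some of the [SM_g]word[SM_h] sequences no longer match wordCompilationRegex.

You can use the s modifier to make . match newlines.

Oops, the s modifier was a Chrome extension rather than standard JavaScript when I wrote this (it has since been standardized in ES2018). To be safe in every engine, use [\s\S] instead of ..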
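A minimal sketch of the difference, using a shortened sample that is made up for illustration (a single word whose text ends with the inserted "\n\n"). The variable names are hypothetical; the third pattern assumes an engine that supports the ES2018 s flag.

```javascript
// Shortened, hypothetical sample: one word whose text ends with the inserted "\n\n".
const sample = "[SM_g]sentence.\n\n[SM_h]";

const plainDot = /\[SM_g\](.*?)\[SM_h\]/;     // "." stops at line terminators
const anyChar = /\[SM_g\]([\s\S]*?)\[SM_h\]/; // [\s\S] matches every character
const dotAll = /\[SM_g\](.*?)\[SM_h\]/s;      // ES2018 "s" flag: "." also matches "\n"

console.log(plainDot.test(sample)); // false
console.log(anyChar.test(sample));  // true
console.log(dotAll.test(sample));   // true

// The captured word keeps the two newlines.
console.log(anyChar.exec(sample)[1] === "sentence.\n\n"); // true
```

None of the patterns uses the g flag, so test() and exec() carry no lastIndex state between calls.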

tokenResponseText_initial = "[SM_g]This[SM_h][SM_g]is[SM_h][SM_g]a[SM_h][SM_g]sentence.[SM_h][SM_l][SM_g]Here[SM_h][SM_g]is[SM_h][SM_g]another[SM_h][SM_g]sentence.[SM_h][SM_1]"

const tokenResponseText_fixedNewLines = tokenResponseText_initial.replace(/(\[SM_g].*?)(\[SM_h]\[SM_l])/g, "$1\n\n$2");

const wordCompilationRegex = /\[SM_g]([\s\S]*?)\[SM_h]/g;
var wordRegexResponse;

var summary = "";

do {
  wordRegexResponse = wordCompilationRegex.exec(tokenResponseText_fixedNewLines);
  if (wordRegexResponse) {
    summary += wordRegexResponse[1] + " ";

  }
} while (wordRegexResponse);
//The following is rough code, 
someParagraphElement.innerHTML = summary;

p {
  white-space: pre-line;
}

<p id="someParagraphElement"></p>
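Since the question asks for something cleaner, here is one possible single-pass variant. It is only a sketch and not part of the code above: compileSummary is a hypothetical helper, and it assumes that an [SM_l] marker directly after a word means "end of paragraph" and that any leftover [SM_...] markers carry no text.

```javascript
// Hypothetical helper, not taken from the question or the answers.
function compileSummary(tokenText) {
  return tokenText
    // Each word token becomes the word itself. An [SM_l] marker directly
    // after it turns the separator into a blank line instead of a space.
    .replace(/\[SM_g\](.*?)\[SM_h\](\[SM_l\])?/g,
      (match, word, lineBreak) => word + (lineBreak ? "\n\n" : " "))
    // Assumption: markers not consumed above (such as the trailing [SM_1])
    // can simply be dropped.
    .replace(/\[SM_\w+\]/g, "")
    .trim();
}

const input = "[SM_g]This[SM_h][SM_g]is[SM_h][SM_g]a[SM_h][SM_g]sentence.[SM_h][SM_l][SM_g]Here[SM_h][SM_g]is[SM_h][SM_g]another[SM_h][SM_g]sentence.[SM_h][SM_1]";

console.log(compileSummary(input));
// This is a sentence.
//
// Here is another sentence.
```

Because the newlines are produced while the words are extracted, no newline ever sits inside a [SM_g]...[SM_h] pair, so the plain . is sufficient here.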

Answer 2

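Your new code does not add line breaks between the sentences because nothing in it does so.

What you want to do is 1) detect whether there is a new sentence, and 2) when there is one, insert two line breaks between the sentences.

To detect a new sentence, use let index = summary.indexOf('.'). If the index is not -1 and is not equal to summary.length - 1 (i.e., the last character of the string), then you have detected at least two sentences.
If you detect two sentences, inserting line breaks between them is simply a matter of using array.splice(). Strings have no splice() method, so convert the string to an array of characters first.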
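The two steps above could be sketched roughly as follows. The input string is hypothetical (a summary whose words are joined by single spaces), and the sketch assumes that the character right after the first period is the space to be replaced.

```javascript
// Hypothetical input: words joined by single spaces, two sentences.
const summary = "This is a sentence. Here is another sentence. ";

const index = summary.indexOf(".");
let result = summary;

// A period that exists and is not the last character means more text follows.
if (index !== -1 && index !== summary.length - 1) {
  const chars = Array.from(summary); // strings have no splice(), arrays do
  // Assumption: the character after the period is a space.
  // Remove it and insert two newlines in its place.
  chars.splice(index + 1, 1, "\n", "\n");
  result = chars.join("");
}

console.log(JSON.stringify(result));
// "This is a sentence.\n\nHere is another sentence. "
```

This only handles the first period; text with more than two sentences needs a loop over every period index.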

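Note: if you need to handle more than two sentences, I suggest writing the code to handle two sentences (as described above) and then modifying it to iterate over all the indexes, as the first answer to this question explains.

If you start writing the code and get stuck or have further questions, just comment on this answer and I will reply.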

JavaScript - How can I perform this string manipulation with regular expressions more cleanly/efficiently?