Question

So I've got the following input in a textarea element:

<quote>hey</quote>

what's up?

I want to isolate the text between <quote> and </quote> (so the result would be 'hey' and nothing else in this case).

I tried using .replace with the following regular expression, but it doesn't produce the right result, and I don't understand why:

quoteContent = value.replace(/<quote>|<\/quote>.*/gi, '');

(The result is 'hey what's up'. It doesn't remove the last part, 'what's up' in this case; it only removes the quote tags.)

Does anyone know how to solve this?

Answer 1

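Even if it's just a small HTML fragment, don't use regex to do any HTML parsing. Instead, take the value, use DOM methods, and extract the text from the element. It's more code, but a better and safer way: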

const el = document.getElementById('foo');
const tmp = document.createElement('template');
tmp.innerHTML = el.value;
console.log(tmp.content.querySelector('quote').innerText);

<textarea id="foo">
<quote>hey</quote>

what's up?
</textarea>

Answer 2

You could also try using the match method:

// .+? is lazy, so the match stops at the first closing tag
quoteContent = value.match(/<quote>(.+?)<\/quote>/)[1];
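
Note that the one-liner above throws if the input has no <quote> pair, because match returns null in that case, and it won't match quoted text that spans several lines. Here's a rough sketch of a more defensive version; extractQuote is just a name I made up for illustration:

```javascript
// Hypothetical helper: returns the text inside the first <quote>...</quote>
// pair, or null when the input contains no such pair.
function extractQuote(value) {
  // [\s\S]*? matches any character, including line breaks,
  // as few times as possible, so it stops at the first closing tag.
  const match = value.match(/<quote>([\s\S]*?)<\/quote>/i);
  return match ? match[1] : null;
}

console.log(extractQuote("<quote>hey</quote>\n\nwhat's up?")); // prints: hey
console.log(extractQuote("no tags here"));                     // prints: null
```

It still treats the input as plain text rather than HTML, so the usual caveats about regex and markup apply.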

Answer 3

You should try to avoid parsing HTML with regular expressions.

<quote><!-- parsing HTML is hard when </quote> can appear in a comment -->hey</quote>
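
To make that concrete, here's a small sketch of what a typical "grab everything up to the closing tag" regex does with that input. The pattern is my own example, not one taken from the question:

```javascript
const tricky =
  '<quote><!-- parsing HTML is hard when </quote> can appear in a comment -->hey</quote>';

// The lazy group stops at the first </quote>, which sits inside the comment,
// so the capture is a piece of the comment instead of the quoted text.
const match = tricky.match(/<quote>([\s\S]*?)<\/quote>/);

console.log(JSON.stringify(match[1])); // "<!-- parsing HTML is hard when "
```

The regex has no idea that the first </quote> is inside a comment, so it never gets as far as 'hey'.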

You can use the DOM to do it for you.

// Parse your fragment
let doc = new DOMParser().parseFromString(
    '<quote>hey</quote>\nWhat\'s up?', 'text/html')
// Use DOM lookup to find a <quote> element and get its
// text content.
let { textContent } = doc.getElementsByTagName('quote')[0]
// We get plain text and don't need to worry about "&lt;"s
textContent === 'hey'

Answer 4

The dot . does not match line breaks, so .* stops at the end of the line containing </quote>.

Try this:

//(.|\n)* will match anything OR a line break
quoteContent = value.replace(/<quote>|<\/quote>(.|\n)*/gi, '');
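
If you want to see the difference for yourself, here's a quick sketch that runs the original pattern next to a variant using the s (dotAll) flag, which makes . match line breaks as well. The sample string is my assumption about what the textarea value looks like:

```javascript
// Assumed textarea value: the quote, a blank line, then the rest
const value = "<quote>hey</quote>\n\nwhat's up?";

// Original pattern: .* stops at the first line break, so the later lines survive
const original = value.replace(/<quote>|<\/quote>.*/gi, '');

// Same pattern with the s (dotAll) flag: . now matches line breaks too
const withDotAll = value.replace(/<quote>|<\/quote>.*/gis, '');

console.log(JSON.stringify(original));   // "hey\n\nwhat's up?"
console.log(JSON.stringify(withDotAll)); // "hey"
```

The s flag needs an ES2018-capable engine; the (.|\n)* version above works in older ones too.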
