Question

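I'm trying to get all HTML tags from a string, with no exceptions. Just to clarify, it needs to work strictly on the string, without converting it to an HTML object. I wrote a regular expression, but it only grabs the tags, without their content.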

var text = '<div class="mura-region-local"><p>In October 2010, Lisa and Eugene Jeffers learned that their daughter Jade, then nearly 2 and a half years old, has autism. The diagnosis felt like a double whammy. The parents were soon engulfed by stress from juggling Jade’s new therapy appointments and wrangling with their health insurance provider, but they now had an infant son to worry about, too. Autism runs in families. Would Bradley follow in his big sister’s footsteps?</p></div><img href=""/>'

var match = text.match(/<?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[\^'">\s]+))?)+\s*|\s*)?>/g);

console.log(match);

Answer 1

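You can't find <smth>...</smth> pairs for all possible tags. Nor can you write a regular expression that recognizes both tagA inside tagB and tagB inside tagA. You would have to write out all of these combinations explicitly, which makes such a regex impossible.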

But if you mean that you only want the <smth ....>, </smth> and <smth..../> tags, without checking that they appear in the correct order, then it is possible.

<(?:\w+(?:\s+\w+=(?:"[^"]*"|'[^']*'))*\/?|(?:\/\w+))>

Here is a test.
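For illustration, here is a minimal sketch of how the regex above could be applied to a string shaped like the one in the question. The helper name extractTags and the shortened sample string are assumptions made for this example, not part of the original answer.

```javascript
// Regex from the answer: matches opening tags (with optional quoted
// attributes), self-closing tags, and closing tags.
// It does not check that tags are paired or nested correctly.
const TAG_RE = /<(?:\w+(?:\s+\w+=(?:"[^"]*"|'[^']*'))*\/?|(?:\/\w+))>/g;

// Hypothetical helper: returns every tag found in the string,
// or an empty array when nothing matches.
function extractTags(html) {
  return html.match(TAG_RE) || [];
}

// Shortened version of the string from the question.
const sample = '<div class="mura-region-local"><p>Some text</p></div><img href=""/>';

console.log(extractTags(sample));
// [ '<div class="mura-region-local">', '<p>', '</p>', '</div>', '<img href=""/>' ]
```

Note that this pattern is fairly strict: it only accepts attributes written as name="value" or name='value', so tags with bare attributes (such as <input disabled>), hyphenated attribute names (such as data-id), or a space before the closing slash (such as <br />) are skipped.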
