Question

I have the following string:

let html = `<!DOCTYPE html>
<html xmlns="https://www.w3.org/1999/xhtml">
    <head>
        <title>Hello, world!</title>
    </head>
    <body>
        <p>Hello, world!</p>
    </body>
</html>`;

How can I extract only the opening HTML tag? I only need:

'<html xmlns="https://www.w3.org/1999/xhtml">'

Please suggest a regular expression, if that is the best approach.

Answer 1

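Assuming you want to capture the <html> tag, you can simply use /<html.*>/.

This searches for <html followed by any number of characters and ends at the last > on the same line: .* is greedy, and . does not match line breaks by default. Here the opening tag is alone on its line, so the match is exactly that tag.

This can be seen in the following: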


let html = `<!DOCTYPE html>
<html xmlns="https://www.w3.org/1999/xhtml">
    <head>
        <title>Hello, world!</title>
    </head>
    <body>
        <p>Hello, world!</p>
    </body>
</html>`;

console.log(html.match(/<html.*>/)[0]);


This can also be seen on Regex101.
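
If the tag might share a line with other markup, a negated character class keeps the match from running past the first >. The following is a minimal sketch, assuming no attribute value contains a literal >; extractOpeningHtmlTag is a hypothetical helper name, not something from the answer above:

```javascript
// Hypothetical helper: returns the opening <html ...> tag,
// or null when the string contains none.
function extractOpeningHtmlTag(source) {
  // [^>]* stops at the first ">", so the match cannot run past the tag
  // even when more markup follows on the same line.
  // \b rejects names such as <htmlx>; the i flag also accepts <HTML>.
  const match = source.match(/<html\b[^>]*>/i);
  return match ? match[0] : null;
}

const sample = '<!DOCTYPE html><html lang="en"><head></head></html>';
console.log(extractOpeningHtmlTag(sample)); // <html lang="en">
console.log(extractOpeningHtmlTag('<p>no root tag</p>')); // null
```

Returning null instead of indexing [0] directly avoids a TypeError when the string has no <html> tag at all.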

Answer 2

If you only want to extract the second line, you can split the string on \n and take the line you need:

let html = `<!DOCTYPE html>
<html xmlns="https://www.w3.org/1999/xhtml">
    <head>
        <title>Hello, world!</title>
    </head>
    <body>
        <p>Hello, world!</p>
    </body>
</html>`;

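// split() already splits on every occurrence, so the g flag is not needed.
const lines = html.split('\n');
// Index 1 is the second line, which holds the opening <html> tag here.
console.log(lines[1]);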
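
A fixed index only works while the tag sits on the second line. As a rough sketch, the line can be looked up by its prefix instead; findHtmlTagLine is a hypothetical helper name, and it assumes the opening tag starts its own line (anything else on that line is returned with it):

```javascript
// Hypothetical helper: returns the first line that starts with "<html"
// (leading and trailing whitespace removed), or undefined if there is none.
function findHtmlTagLine(source) {
  return source
    .split(/\r?\n/) // accept both LF and CRLF line endings
    .map((line) => line.trim())
    .find((line) => line.toLowerCase().startsWith('<html'));
}

const doc = '\n<!DOCTYPE html>\r\n  <html lang="en">\r\n<head></head>';
console.log(findHtmlTagLine(doc)); // <html lang="en">
```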

JavaScript: extract an HTML tag from a string
