I have a simple but malformed HTML page, shown here with all of its errors:
<HTML>
<head>
<title>Official game sheet</title>
</head>
<body class="sheet">
</BODY>
</HTML>
I am trying to apply the XPath //title to the document parsed from this HTML.
const document = parse5.parse(xmlString);
const xhtml = xmlser.serializeToString(document);
const doc = new dom().parseFromString(xhtml);
const select = xpath.useNamespaces({
"x": "http://www.w3.org/1999/xhtml"
});
const nodes = select("//title", doc);
console.log(nodes);
I tried the solution from here, but it failed: the returned node list is empty.
Answer 0: (score: 2)
Here you go, @neptune: you don't need parse5 or xmlser; xpath and xmldom are enough.
var xpath = require('xpath');
var dom = require('xmldom').DOMParser;
var xmlString = `
<HTML>
<head>
<title>Official game sheet</title>
<custom>Here we are</custom>
<body class="sheet">
</BODY>
</HTML>`;
//const document = parse5.parse(xmlString);
//const xhtml = xmlser.serializeToString(document);
const doc = new dom().parseFromString(xmlString);
const nodes = xpath.select("//custom", doc);
//console.log(document);
console.log(nodes[0].localName + ": " + nodes[0].firstChild.data);
console.log("Node: " + nodes[0].toString());
Answer 1: (score: 1)
Please correct the lines as follows to get the title:
const nodes = select("//x:title//text()", doc);
console.log(nodes[0].data)
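The x: prefix is what makes the difference. parse5 places every element in the XHTML namespace (http://www.w3.org/1999/xhtml), and in XPath 1.0 an unprefixed name such as title only matches elements that are in no namespace, so //title selects nothing. As a rough sketch, a small helper can add the prefix to the element steps of a simple path. Note that prefixXPath is a hypothetical name, not part of the xpath package, and it only handles plain paths built from / and //, with no predicates or axes.

```javascript
// Hypothetical helper (not part of the xpath package): add a namespace prefix
// to the element-name steps of a simple XPath such as "//title".
// Assumption: the path uses only "/" and "//" separators, with no predicates
// or explicit axes.
function prefixXPath(path, prefix) {
  return path
    .split("/")
    .map((step) => {
      // Empty segments come from "//" or a leading "/"; keep them as they are.
      if (step === "" || step === "*" || step === "." || step === "..") {
        return step;
      }
      // Attributes, node tests like text(), and already-prefixed names
      // are left untouched.
      if (step.startsWith("@") || step.includes("(") || step.includes(":")) {
        return step;
      }
      return prefix + ":" + step;
    })
    .join("/");
}

console.log(prefixXPath("//title", "x"));          // //x:title
console.log(prefixXPath("//title//text()", "x"));  // //x:title//text()
console.log(prefixXPath("/html/head/title", "x")); // /x:html/x:head/x:title
```

The result can then be passed to the select function created with xpath.useNamespaces in the question, where x is already bound to the XHTML namespace.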