Question

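I'm trying to clean up Angular markup so that I can later send it to jsPDF. jsPDF usually fails when it encounters HTML it doesn't recognize, so I'm trying to strip that out.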

So far, the expression looks something like this:

'<code>'.replace(/ng-\w+="(\w|\d\s)+"/,'')

This works for simple cases, but I need a more refined expression, and I haven't been able to come up with one.

ng-\w+=" 
#Finds expressions like ng-if, ng-model, ng-class, etc

(\w|\d\s)+ 
#This expression is meant to accept only spaces, letters and digits

What I really need is to match everything between the double quotes.

Answer 1

Why not use DOMParser, like this? It's best not to try to parse HTML with regular expressions.

const html = `
<div id="myid" class="myclass" ng-if="ngif attribute" ng-model="ngmodel attribute" ng-class="ngclass attribute">content</div>
<div ng-if="another ngif attribute">content 2</div>
`;
const parsedDoc = new DOMParser().parseFromString(html, "text/html");
const attributesToRemove = [
  'ng-if',
  'ng-model',
  'ng-class',
];
attributesToRemove.forEach((attribName) => {
  parsedDoc.querySelectorAll('[' + attribName + ']')
    .forEach((elm) => elm.removeAttribute(attribName));
});
console.log(parsedDoc.body.innerHTML);

Answer 2

Expanding on the other answer...

You can use DOMParser, then use a TreeWalker to traverse all the element nodes and remove every attribute that starts with ng-:

const html = `
<div id="myid" class="myclass" ng-if="ngif attribute" ng-model="ngmodel attribute" ng-class="ngclass attribute">content</div>
<div ng-if="another ngif attribute">content 2</div>
`;
const el = new DOMParser().parseFromString(html, "text/html");

// Walk every element node in the parsed document.
const treeWalker = document.createTreeWalker(el, NodeFilter.SHOW_ELEMENT);

while (treeWalker.nextNode()) {
  const node = treeWalker.currentNode;
  // Copy the live attribute list first, because removing attributes mutates it.
  Array.from(node.attributes)
    .filter((attr) => attr.name.startsWith('ng-'))
    .forEach((attr) => node.removeAttribute(attr.name));
}
console.log(el.documentElement.querySelector('body').innerHTML);

Answer 3

You could try this: /ng-\w+=("|').*?\1/

It uses JavaScript's regular expression syntax.
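As a rough sketch of how that pattern might be applied, the snippet below wraps it in a helper. The function name stripNgAttributes and the sample string are made up for illustration. The g flag (so that every attribute is replaced, not just the first) and the leading \s+ (so that the space before each attribute is removed too) are additions that are not part of the pattern as posted.

```javascript
// Hypothetical helper; the core pattern ng-\w+=("|').*?\1 comes from the answer above.
// Assumptions added here: the g flag and the leading \s+.
function stripNgAttributes(html) {
  // ("|') captures the opening quote, .*? lazily matches the value,
  // and \1 requires the same quote character to close it.
  return html.replace(/\s+ng-\w+=("|').*?\1/g, "");
}

// Sample input with one double-quoted and one single-quoted ng- attribute.
const sample = '<div id="myid" ng-if="a && b" ng-class=\'{active: isOn}\'>content</div>';
const cleaned = stripNgAttributes(sample);

console.log(cleaned); // logs: <div id="myid">content</div>
```

Because \w+ does not match hyphens, attribute names with a second hyphen (for example ng-model-options) are left untouched by this pattern. A value that contains its own quote character would also end the match early. The DOMParser approaches in the other answers avoid both problems.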
