Question

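I'm looking for a way to strip tags from a string on the backend using NodeJS. Some answers here suggest trying node-validator, but neither the documentation nor any of those answers explain specifically how to use it for this.

For example, I have a string in a variable like this:

INPUT: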

var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>';

Desired output:

var newText = 'Hello there! I am a string but not a very exciting one!';

The node-validator documentation lists several options, and I think the most relevant one is the trim() function:

var check = require('validator').check,
    sanitize = require('validator').sanitize

//Validate
check('test@email.com').len(6, 64).isEmail();        //Methods are chainable
check('abc').isInt();                                //Throws 'Invalid integer'
check('abc', 'Please enter a number').isInt();       //Throws 'Please enter a number'
check('abcdefghijklmnopzrtsuvqxyz').is(/^[a-z]+$/);

//Sanitize / Filter
var int = sanitize('0123').toInt();                  //123
var bool = sanitize('true').toBoolean();             //true
var str = sanitize(' \t\r hello \n').trim();       //'hello'
var str = sanitize('aaaaaaaaab').ltrim('a');         //'b'
var str = sanitize(large_input_str).xss();
var str = sanitize('&lt;a&gt;').entityDecode();      //'<a>'

Is it possible to use this to strip tags (and classes) from a string?

EDIT: I also have cheerio loaded (basically jQuery) and tried something like this:

HTML
<div class="select">
<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>
</div>

JAVASCRIPT
(function() {
    // Selector must be a quoted string
    var text = $('.select *').each(function() {
        var content = $(this).contents();
        $(this).replaceWith(content);
    });
    return text;
}());

But this results in an 'Object 'Hello....' has no method "contents"' error. If this would be easier with jQuery, I could use a similar function there.

Answer 1

I don't use node-validator, but something like this worked for me:

var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very    exciting one!</span></p>';

text.replace(/(<([^>]+)>)/ig,"");

Output

Hello there! I am a string but not a very    exciting one!

Now you can trim it using node-validator.

Got the code snippet from here.
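As a rough sketch of the approach above, the regex replacement can be wrapped in a small helper that also collapses the leftover whitespace and trims the ends, so no separate trimming step is needed. The name stripTags is made up for this illustration and is not part of node-validator or any other library.

```javascript
// Hypothetical helper for illustration; not part of node-validator or any library.
function stripTags(html) {
    return String(html)
        .replace(/<[^>]+>/g, '')   // drop anything that looks like <...>
        .replace(/\s+/g, ' ')      // collapse runs of whitespace left behind
        .trim();                   // remove leading/trailing whitespace
}

var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very    exciting one!</span></p>';
var newText = stripTags(text);
console.log(newText); // Hello there! I am a string but not a very exciting one!
```

Keep in mind that a regex is not an HTML parser: it can be fooled by a ">" inside an attribute value, and it leaves the contents of script or style elements in place, so it is probably best reserved for simple, trusted markup rather than used as an XSS filter.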

Answer 2

Well, I haven't read your whole question, but you can get the desired output using the string.js node module. You can install it using npm.

Here is the code I used:

var S = require('string');
var text = '<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>';
console.log(text);
text = S(text).stripTags().s;
console.log(text);

Output:

<p><b>Hello there!</b> I am a string <span class="small">but not a very exciting one!</span></p>
Hello there! I am a string but not a very exciting one!

How to install string.js?

npm install --save string

Further reference

Answer 3

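It looks like node-validator doesn't have any kind of HTML tag stripping built in, and trim() won't work because it seems to only let you specify individual characters to remove. It is very easy to extend, though, so you could write an extension for it that removes HTML tags.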
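To give an idea of what such an extension might do, here is a minimal sketch built on a small stand-in wrapper. Sanitizer, sanitizeSketch, stripTags and value are all hypothetical names invented for this example; this is not node-validator's actual extension API, only the chainable pattern it follows.

```javascript
// Hypothetical stand-in for a chainable sanitizer; NOT node-validator's real API.
function Sanitizer(str) {
    this.str = String(str);
}

// The "extension": strip anything that looks like an HTML tag.
Sanitizer.prototype.stripTags = function () {
    this.str = this.str.replace(/<[^>]+>/g, '');
    return this; // return the wrapper so calls can be chained
};

// Trim whitespace (default) or a caller-supplied set of characters from both ends.
Sanitizer.prototype.trim = function (chars) {
    // Escape characters that are special inside a regex character class
    var set = chars ? chars.replace(/[\\\]^-]/g, '\\$&') : '\\s';
    this.str = this.str.replace(new RegExp('^[' + set + ']+|[' + set + ']+$', 'g'), '');
    return this;
};

// Unwrap the final string.
Sanitizer.prototype.value = function () {
    return this.str;
};

function sanitizeSketch(str) {
    return new Sanitizer(str);
}

var cleaned = sanitizeSketch('  <p><b>Hello there!</b></p> \n').stripTags().trim().value();
console.log(cleaned); // Hello there!
```

In a real extension the tag-stripping method would be attached to the library's own filter object instead, following whatever extension mechanism the installed version documents.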

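Otherwise, you could use cheerio's .text() method (docs) to get the combined text contents of an element and its descendants.

Something like this should work: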

$('.select *').each(function() {
    // Replace each element with its plain text content
    var content = $(this).text();
    $(this).replaceWith(content);
});

This will remove all the HTML inside .select. If you want it to apply to every element, remove .select from the selector and keep just *.

Strip HTML tags from a string using node-validator
