How to get only the text from HTML tags

Time: 2017-01-26 04:33:39

Tags: javascript jquery string text gettext

I queried some data, and the result looks like this:

<p><img src="xxx.png" alt="" style="margin:5px;" /><br></p><p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.<br></p>

That is what is shown in the console. I want to remove all HTML tags from this data and get only a string like this:

  

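Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.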

Does anyone know how to remove the single and double quotes from this data, or have some other solution? Thanks.

8 answers:

Answer 0: (score: 2)

You can create a temporary element and read its .textContent property:

var d = document.createElement('div');
d.innerHTML = htmlContent;
var textContent = d.textContent || d.innerText;

If you can use jQuery:

var textContent = $('<div/>').html(htmlContent).text();
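A related variation, offered only as a sketch: in browsers, DOMParser parses the string into a separate, inert document, so scripts in the markup are not executed and images are not fetched while you read the text. The function name htmlToText and its optional second parameter are hypothetical and not part of the answer above; the parameter exists only so a parser can be passed in outside a browser.

```javascript
// Hypothetical helper (an assumption, not from the original answer).
// Parses the markup into a detached document and returns its text.
function htmlToText(html, parser) {
  // In a browser, fall back to the built-in DOMParser.
  parser = parser || new DOMParser();
  var doc = parser.parseFromString(html, 'text/html');
  // textContent returns the text of all descendants, without any tags.
  return doc.body.textContent || '';
}

// Browser usage, with htmlContent holding the queried markup:
//   var textContent = htmlToText(htmlContent);
```

Compared with assigning to innerHTML on a temporary div, this keeps the parsed markup out of the live page's document.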

Answer 1: (score: 2)

Use a regular expression:

function RemoveHTMLTags(html) {
    // Match anything that looks like a tag: "<", one or more non-">" characters, then ">"
    var regX = /(<([^>]+)>)/ig;
    alert(html.replace(regX, ""));
}
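As a variation on the regular-expression approach, here is a minimal sketch that returns the stripped string instead of alerting it, decodes a handful of common entities, and collapses whitespace. The name stripHtmlTags and the small entity table are assumptions for illustration. A regex is not a full HTML parser, so this is only suitable for simple, trusted markup like the sample in the question.

```javascript
// Hypothetical helper (not from the original answer): strip tags with a regex
// and return the resulting text.
function stripHtmlTags(html) {
  // Small illustrative entity table; real HTML defines many more entities.
  var entities = {
    '&lt;': '<', '&gt;': '>', '&quot;': '"',
    '&#39;': "'", '&nbsp;': ' ', '&amp;': '&'
  };

  return String(html)
    // Replace every tag with a space so neighbouring words do not fuse.
    .replace(/<[^>]+>/g, ' ')
    // Decode the listed entities in a single pass to avoid double decoding.
    .replace(/&(?:lt|gt|quot|#39|nbsp|amp);/g, function (m) { return entities[m]; })
    // Collapse runs of whitespace and trim both ends.
    .replace(/\s+/g, ' ')
    .trim();
}

var html = '<p><img src="xxx.png" alt="" /><br></p><p>Lorem Ipsum is simply dummy text.<br></p>';
console.log(stripHtmlTags(html)); // prints "Lorem Ipsum is simply dummy text."
```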

Answer 2: (score: 2)

Just add the following and try this simple jQuery:

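<div id="output"></div>

<script type="text/javascript">
  // .text() returns the combined text of all matched <p> elements;
  // writing it back with .text() keeps it from being re-parsed as HTML
  $("#output").text($("p").text());
</script>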

Answer 3: (score: 1)

console.log($('p').text())
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<p><img src="xxx.png" alt="" style="margin:5px;" /><br></p><p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.<br></p>

Use .text() to get only the text.

Answer 4: (score: 1)

You can use innerHTML to get it.

In your JavaScript:

var a = document.getElementById("para"); // Let us say your paragraph id is "para"

var b = a.innerHTML;

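Now b will contain the paragraph's content as a string. Note that innerHTML returns markup, so nested tags (such as the trailing <br> in the sample) are included; use textContent if you want the text only.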

var a = document.getElementById("para");

var b = a.innerHTML;
alert(b);
<p><img src="xxx.png" alt="" style="margin:5px;"/><br></p>
<p id="para">Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.<br></p>

You can also read the Mozilla documentation for innerHTML.

Also, I found this great Guide that explains the differences between innerHTML, innerText, and textContent, which may help you in the future.

Also check out this question: Get innerHtml but remove unwanted tags

Answer 5: (score: 1)

<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>
<p><img src="xxx.png" alt="" style="margin:5px;" /><br></p><p id="page">Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.<br></p>

<button onclick="getTextOnly()" >Show Text</button>
<script>
function getTextOnly() {
  var pTag = document.getElementById("page");
  // innerHTML still contains the <br> tag, so replace every occurrence with a space
  var textOnly = pTag.innerHTML;
  textOnly = textOnly.replace(/<br\s*\/?>/gi, " ");
  alert(textOnly);
}
</script>
</body>
</html>

How about this? If you only want to get the text inside the <p> tag, you should set an id on that <p> tag. Just a suggestion.
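When cleaning <br> out of a string taken from innerHTML, keep in mind that replace() with a plain string pattern only replaces the first occurrence. A minimal sketch of the difference, using a hypothetical helper name brToSpace and a made-up sample string:

```javascript
// Hypothetical helper: turn every <br>, <br/> or <br /> into a single space.
function brToSpace(html) {
  // The g flag replaces all matches; the i flag also matches <BR>.
  return html.replace(/<br\s*\/?>/gi, ' ');
}

var sample = 'line one<br>line two<br />line three';

// String pattern: only the first <br> is replaced.
console.log(sample.replace('<br>', ' ')); // prints "line one line two<br />line three"

// Regular expression with the g flag: every variant is replaced.
console.log(brToSpace(sample)); // prints "line one line two line three"
```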

Answer 6: (score: 1)

You need to consider the following: a) whether you want to get the text of all DOM elements of a particular type.

If you do, use the following:

<div>
  A lot of content here
</div>

var data = $('div');
// A jQuery object has no innerHTML property; use .text() (or .html() for the markup)
console.log(data.text());

Otherwise, add a class or id to the element(s) whose data you need, then use the code above, replacing "div" with your class or id selector.

Answer 7: (score: 0)

In this example, both the jQuery way and the vanilla JavaScript way are shown:

//jQuery way:
console.log($('p').text())

// OR can be using vanilla JS:
let para = document.getElementsByTagName('p'); // this can be also by getElementById()

console.log(para[1].innerText);// we are using [1] because we have two <p> tags
<p><img src="xxx.png" alt="" style="margin:5px;" /><br></p><p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.<br></p>
<script
  src="https://code.jquery.com/jquery-3.4.1.min.js"
  integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo="
  crossorigin="anonymous"></script>