Question

OK, so basically I want to search the body tag for {~, grab everything that follows up to ~}, and turn that into a string (not including the {~ or ~}).

Answer 1

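// Lazy (.+?) stops at the first ~}, so a second tag on the page is not swallowed
const match = document.body.innerHTML.match(/\{~(.+?)~\}/);
// match[1] is the capture group: the text between {~ and ~}
if (match) console.log(match[1]);
else console.log('No match found');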

<body>text {~inner~} text </body>

Answer 2


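$(function () {
  var bodyText = document.body.innerHTML;

  // With the g flag, match() returns every full match (delimiters included),
  // or null when nothing matches, so fall back to an empty array.
  var found = bodyText.match(/{~(.*?)~}/gi) || [];

  $.each(found, function (index, value) {
    // Strip the surrounding {~ and ~} from each match
    var ret = value.replace(/{~/g, '').replace(/~}/g, '');
    console.log(ret);
  });
});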


<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
   <body> {~Content 1~}

{~Content 2~}
</body>


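There you go: put gi at the end of the regular expression. The g flag makes it find every match instead of only the first; the i flag makes the match case-insensitive, which this pattern does not strictly need.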
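One detail worth spelling out: with the g flag, String.prototype.match returns only the full matches and drops the capture groups, which is why the snippet above strips {~ and ~} by hand. A minimal sketch of an alternative, assuming an environment that supports String.prototype.matchAll (ES2020). The helper name extractTags and the sample string are made up for illustration:

```javascript
// Hypothetical helper: returns the text between each {~ and ~} pair.
function extractTags(text) {
  // matchAll requires the g flag and yields one match object per hit,
  // so the capture group is available as m[1].
  return Array.from(text.matchAll(/\{~(.*?)~\}/g), m => m[1]);
}

const sample = ' {~Content 1~}\n\n{~Content 2~}\n';

// With g, match() returns whole matches, delimiters included.
console.log(sample.match(/\{~(.*?)~\}/g)); // [ '{~Content 1~}', '{~Content 2~}' ]

// matchAll keeps the capture groups, so no stripping is needed.
console.log(extractTags(sample)); // [ 'Content 1', 'Content 2' ]
```

In a browser the same helper could be fed document.body.innerHTML, with the caveats about scripts and comments that the next answer raises.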

Answer 3

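This is a harder problem to solve than it first appears; if you just grab the body's innerHTML, things like script tags and comments make it tricky. The following function takes a base element to search within (in your case you will want to pass in document.body) and returns an array containing any strings it finds.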

function getMyTags (baseElement) {
  const rxFindTags = /{~(.*?)~}/g;

  // .childNodes contains not only elements, but any text that
  // is not inside of an element, comments as their own node, etc.
  // We will need to filter out everything that isn't a text node
  // or a non-script tag.
  let nodes = baseElement.childNodes;
  let matches = [];
  
  nodes.forEach(node => {
    let nodeType = node.nodeType
    // if this is a text node or an element, and it is not a script tag
    if (nodeType === 3 || nodeType === 1 && node.nodeName !== 'SCRIPT') {
      let html;
      if (node.nodeType === 3) { // text node
        html = node.nodeValue;
      } else { // element
        html = node.innerHTML; // or .innerText if you don't want the HTML
      }

      let match;
      // search the html for matches until it can't find any more
      while ((match = rxFindTags.exec(html)) !== null) {
        // the [1] is to get the first capture group, which contains
        // the text we want
        matches.push(match[1]);
      }
    }
  });

  return matches;

}

console.log('All the matches in the body:', getMyTags(document.body));
console.log('Just in header:', getMyTags(document.getElementById('title')));

<h1 id="title"><b>{~Foo~}</b>{~bar~}</h1>
Some text that is {~not inside of an element~}
<!-- This {~comment~} should not be captured -->
<script>
 // this {~script~} should not be captured
</script>
<p>Something {~after~} the stuff that shouldn't be captured</p>
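Note that the function above looks only at the direct children of baseElement and reads innerHTML for element children, so a comment or script nested deeper inside an element would still be searched. A hedged sketch of a recursive variant that reads text nodes only; collectTags is a hypothetical name, and the code relies on nothing beyond the standard nodeType, nodeName, nodeValue and childNodes properties:

```javascript
// Hypothetical recursive variant: visits text nodes only, at any depth.
function collectTags(node, matches = []) {
  const ELEMENT_NODE = 1;
  const TEXT_NODE = 3;

  if (node.nodeType === TEXT_NODE) {
    // matchAll needs the g flag; m[1] is the text between {~ and ~}
    for (const m of node.nodeValue.matchAll(/\{~(.*?)~\}/g)) {
      matches.push(m[1]);
    }
  } else if (node.nodeType === ELEMENT_NODE &&
             node.nodeName !== 'SCRIPT' && node.nodeName !== 'STYLE') {
    // Comment nodes (nodeType 8) never reach this branch, so they are skipped.
    for (const child of Array.from(node.childNodes)) {
      collectTags(child, matches);
    }
  }
  return matches;
}

// In a browser: console.log(collectTags(document.body));
```

The trade-off is that a tag whose text is split across elements, such as {~a <b>b</b>~}, is not found, because each text node is searched on its own.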

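The regular expression /{~(.*?)~}/g works as follows:

{~ starts the match at a literal {~.
(.*?) captures whatever comes next; the ? makes the quantifier "non-greedy" (also known as "lazy"), so if a string we search contains two {~something~} instances, each one is captured separately instead of one capture running from the first {~ to the last ~} in the string.
~} says that our match must be followed by ~}.

The g option makes it a "global" search, meaning it finds all matches in the string rather than only the first.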
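To make the greedy versus lazy difference concrete, here is a small sketch that runs both forms of the pattern against the same string. The sample string and variable names are invented for illustration:

```javascript
const sampleText = 'a {~one~} b {~two~} c';

// Greedy: .* runs on to the last ~} in the string.
const greedy = sampleText.match(/\{~(.*)~\}/)[1];

// Lazy: .*? stops at the first ~} it can reach.
const lazy = sampleText.match(/\{~(.*?)~\}/)[1];

// Lazy plus g: exec() in a loop walks through every tag in turn.
const rx = /\{~(.*?)~\}/g;
const all = [];
let m;
while ((m = rx.exec(sampleText)) !== null) {
  all.push(m[1]);
}

console.log(greedy); // one~} b {~two
console.log(lazy);   // one
console.log(all);    // [ 'one', 'two' ]
```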

Further reading

nodeType
Regular-Expressions.info
MDN RegExp documentation has a great regular expression tutorial.
RegExr

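Tools

There are many different tools that can help you develop regular expressions. Here are a couple I have used:

RegExPal has a great tool that explains how a particular regular expression works.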


Search the document body for {~content~}

3 answers: