Splitting a string into an array with a regular expression

Date: 2019-05-16 15:50:56

Tags: javascript regex split

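I have a string of elements spread over multiple lines (I can change it to be all on one line if necessary), and I want to split it on the <section> elements. I thought this would be easy: just str.split(regex), or even str.split() with a plain string.

I tried the regex SecRegex = /<section.*?>[\s\S]*?<\/section>/; var fndSection = result.split(SecRegex);

I also tried calling result.split() with a plain string instead of a regex.

I have looked all over the web, and from what I found, one of the two approaches above should work.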

result ='     

<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap2"> <title>THEORY</title>
<section id="Thoery">
<title>theory Section</title>
<para0 verstatus="ver">
<title>Theory Para 0 </title>
<text>blah blah</text>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter> <title>Chapter Title</title>
<section id="Section ID">
<title>Section Title</title>
<para0>
<title>Para0 Title</title>
<para>blah blah</para>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<line>Title</line>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<list>Title</list>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<ipbchap>
<tags></tags>
</ipbchap>

</body>
<rear>
<tags></tags>
</rear>
</doc>'

Code

SecRegex = /<section.*?>[\s\S]*?<\/section>/;
var fndSection = result.split(SecRegex);

console.log("result string " + fndSection);

This is the result I get from the code I have:

result string <chapter id="chap2"> <title>THEORY</title> , , , , <chapter id="chap1"> <para0> <title></title></para0> </chapter> 
result string <chapter id="chap1"> <para0> <title></title></para0> </chapter> 
result string <chapter

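As you can see, the sections are missing from the output.

What I want is to put the <section>.*?</section> strings into an array.

Thanks to everyone for any help with this.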

Maxine

3 answers:

Answer 0 (score: 2)

Your expression looks great! You may just want to modify it slightly, perhaps to something like:

/<section[a-z="'\s]+>([\s\S]*?)<\/section>/gmi

RegEx

If this is not the expression you want, you can modify or change it on regex101.com.

RegEx circuit

You can also visualize your expression on jex.im.

JavaScript test

const regex = /<section[a-z="'\s]+>([\s\S]*?)<\/section>/gmi;
const str = `<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap2"> <title>THEORY</title>
<section id="Thoery">
<title>theory Section</title>
<para0 verstatus="ver">
<title>Theory Para 0 </title>
<text>blah blah</text>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>`;
const subst = `$1`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);


If you also want to capture the section tags, just wrap the entire expression in a capturing group:

const regex = /(<section[a-z="'\s]+>([\s\S]*?)<\/section>)/gmi;
const str = `<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap2"> <title>THEORY</title>
<section id="Thoery">
<title>theory Section</title>
<para0 verstatus="ver">
<title>Theory Para 0 </title>
<text>blah blah</text>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>`;
const subst = `\n$1\n`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: \n', result);
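
The replace calls above produce a single string rather than an array. If an array is the goal, one possible sketch is to run the same pattern through String.prototype.matchAll and collect capture group 1. The sample input and the names sample and inner are illustrative and not from the question. Note also that the character class in this pattern contains no digits, so an opening tag such as the hypothetical id="sec2" below is skipped.

```javascript
// Illustrative input, not taken from the question.
const sample = `<section id="First">
<title>A</title>
</section>
<section id="sec2">
<title>B</title>
</section>`;

// Same pattern as in this answer (the m flag is dropped; it has no effect without ^ or $).
const regex = /<section[a-z="'\s]+>([\s\S]*?)<\/section>/gi;

// matchAll yields one match object per section; index 1 is the first capture group.
const inner = Array.from(sample.matchAll(regex), m => m[1].trim());

// Only the first section is collected: the digit in id="sec2"
// is not covered by the character class [a-z="'\s].
console.log(inner);
```

If ids may contain digits or other characters, a looser opening tag such as <section[^>]*> might be a better fit, at the cost of being less strict about what the tag contains.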

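Answer 1 (score: 2)

Do not use RegEx on HTML (or any of HTML's cousins). Collect your <section>s into a NodeList. Convert that NodeList into an array. Convert each node into a string. This can be done in one line: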

const strings = Array.from(document.querySelectorAll('section')).map(section => section.outerHTML);

The following demo is a breakdown of the example above.

// Collect all <section>s into a NodeList
const sections = document.querySelectorAll('section');

// Convert NodeList into an Array
const array = Array.from(sections);

/*
Iterate through Array -- on each <section>...
convert it into a String
*/
const strings = array.map(section => section.outerHTML);

// View array as a template literal for a cleaner look
console.log(`${strings}`);

// Verifying it's an array of multiple elements
console.log(strings.length);

// Verifying that they are in fact strings
console.log(typeof strings[0]);
<chapter id="chap1">
  <para0>
    <title></title>
  </para0>
</chapter>

<chapter id="chap2">
  <title>THEORY</title>
  <section id="Thoery">
    <title>theory Section</title>
    <para0 verstatus="ver">
      <title>Theory Para 0 </title>
      <text>blah blah</text>
    </para0>
  </section>

  <section id="Next section">
    <title>title</title>
    <para0>
      <title>Title</title>
      <text>blah blah</text>
    </para0>
  </section>

  <section id="More sections">
    <title>title</title>
    <para0>
      <title>Title</title>
      <text>blah blah</text>
    </para0>
  </section>

  <section id="section">
    <title>title</title>
    <para0>
      <title>Title</title>
      <text>blah blah</text>
    </para0>
  </section>

  <chapter id="chap1">
    <para0>
      <title></title>
    </para0>
  </chapter>

  <chapter id="chap1">
    <para0>
      <title></title>
    </para0>
  </chapter>

  <chapter>
    <title>Chapter Title</title>
    <section id="Section ID">
      <title>Section Title</title>
      <para0>
        <title>Para0 Title</title>
        <para>blah blah</para>
      </para0>
    </section>

    <section id="Next section">
      <title>title</title>
      <para0>
        <line>Title</line>
        <text>blah blah</text>
      </para0>
    </section>

    <section id="More sections">
      <title>title</title>
      <para0>
        <list>Title</list>
        <text>blah blah</text>
      </para0>
    </section>

    <section id="section">
      <title>title</title>
      <para0>
        <title>Title</title>
        <text>blah blah</text>
      </para0>
    </section>

    <ipbchap>
      <tags></tags>
    </ipbchap>

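Answer 2 (score: 1)

You do not want to split the string; you want to extract the data that matches. You can do that with String#match. Note that you need to add the g flag to get all matches: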

var result = `<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap2"> <title>THEORY</title>
<section id="Thoery">
<title>theory Section</title>
<para0 verstatus="ver">
<title>Theory Para 0 </title>
<text>blah blah</text>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter> <title>Chapter Title</title>
<section id="Section ID">
<title>Section Title</title>
<para0>
<title>Para0 Title</title>
<para>blah blah</para>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<line>Title</line>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<list>Title</list>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<ipbchap>
<tags></tags>
</ipbchap>

</body>
<rear>
<tags></tags>
</rear>
</doc>`;
// the g flag is added ---------------------↓
SecRegex = /<section.*?>[\s\S]*?<\/section>/g;
var fndSection = result.match(SecRegex);


console.log("result string ", fndSection);
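
To make the difference between split and match concrete, here is a minimal sketch on a shortened, hypothetical input (the sample string and the variable names are assumptions for illustration, not the asker's data). It shows that split discards whatever the separator regex matched, that match with the g flag returns exactly those pieces, and that a capturing group makes split keep the separators.

```javascript
// Shortened, hypothetical input; the document in the question is much longer.
const sample = '<chapter><section id="a">one</section>\n<section id="b">two</section></chapter>';

const sectionRegex = /<section.*?>[\s\S]*?<\/section>/g;

// split() treats every match as a separator and throws it away,
// so only the text around the sections is left.
const pieces = sample.split(sectionRegex);

// match() with the g flag returns every matched substring instead.
// match() returns null when nothing matches, hence the fallback.
const sections = sample.match(sectionRegex) || [];

// Wrapping the pattern in a capturing group makes split() keep the separators too.
const kept = sample.split(/(<section.*?>[\s\S]*?<\/section>)/);

console.log(pieces);   // text between and around the sections
console.log(sections); // the sections themselves
console.log(kept);     // both, interleaved
```

This also explains the output in the question: the commas in the logged string are the places where the sections were removed.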

However, it is better to parse the DOM and extract the information you need from it, which is simple with DOMParser:

var result = `<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap2"> <title>THEORY</title>
<section id="Thoery">
<title>theory Section</title>
<para0 verstatus="ver">
<title>Theory Para 0 </title>
<text>blah blah</text>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter id="chap1">
<para0><title></title></para0>
</chapter>

<chapter> <title>Chapter Title</title>
<section id="Section ID">
<title>Section Title</title>
<para0>
<title>Para0 Title</title>
<para>blah blah</para>
</para0>
</section>

<section id="Next section">
<title>title</title>
<para0>
<line>Title</line>
<text>blah blah</text>
</para0>
</section>

<section id="More sections">
<title>title</title>
<para0>
<list>Title</list>
<text>blah blah</text>
</para0>
</section>

<section id="section">
<title>title</title>
<para0>
<title>Title</title>
<text>blah blah</text>
</para0>
</section>

<ipbchap>
<tags></tags>
</ipbchap>

</body>
<rear>
<tags></tags>
</rear>
</doc>`

var parser = new DOMParser();
var doc = parser.parseFromString(result, "text/html");

var sections = [...doc.getElementsByTagName("section")];
var fndSection = sections.map(section => section.outerHTML)
console.log(fndSection);