Regex to remove lines containing special characters

Date: 2013-01-06 17:33:23

Tags: regex

        <a class='jdr' href='javascript:void(0);' onClick="return openDiv('jrtp');"></a>            
        <span class="jcn">
            <a href="http://example.com/Ahmedabad/Aptech-N-Power-Hardware-Networking-&lt;near&gt;-Toll-Naka-Opp-Kakadia-Hospital-Below-Sankalp-Reataurant-Bapu-Nagar/079PXX79-XX79-110420173655-D4K6_QWhtZWRhYmFkIENDTkEgVHJhaW5pbmcgSW5zdGl0dXRlcw==_BZDET" title='Aptech N Power Hardware & Networking' >Aptech N Power Hardware & Networkin...</a>
        </span>         

            <section class="jrat">
                <a rel="nofollow" href="http://example.com/Ahmedabad/Aptech-N-Power-Hardware-Networking-&lt;near&gt;-Toll-Naka-Opp-Kakadia-Hospital-Below-Sankalp-Reataurant-Bapu-Nagar/079PXX79-XX79-110420173655-D4K6_QWhtZWRhYmFkIENDTkEgVHJhaW5pbmcgSW5zdGl0dXRlcw==_BZDET#rvw"><span class='s10'></span><span class='s10'></span><span class='s10'></span><span class='s10'></span><span class='s0'></span></a> 
                                        <a class="jrt" href="http://example.com/Ahmedabad/Aptech-N-Power-Hardware-Networking-&lt;near&gt;-Toll-Naka-Opp-Kakadia-Hospital-Below-Sankalp-Reataurant-Bapu-Nagar/079PXX79-XX79-110420173655-D4K6_QWhtZWRhYmFkIENDTkEgVHJhaW5pbmcgSW5zdGl0dXRlcw==_BZDET#rvw">2 ratings</a>
                    <span class="jrt"> |</span>
                                    <a class="rate_this" onclick="_ct('ratethis','lspg');"  href="http://example.com/Ahmedabad/Aptech-N-Power-Hardware-Networking-&lt;near&gt;-Toll-Naka-Opp-Kakadia-Hospital-Below-Sankalp-Reataurant-Bapu-Nagar/079PXX79-XX79-110420173655-D4K6_QWhtZWRhYmFkIENDTkEgVHJhaW5pbmcgSW5zdGl0dXRlcw==_BZDET/writereview">Rate this</a>
            </section>                      
        <section class="jcar">
            <section class="jbc">
                                        <a href="http://example.com/Ahmedabad/Aptech-N-Power-Hardware-Networking-&lt;near&gt;-Toll-Naka-Opp-Kakadia-Hospital-Below-Sankalp-Reataurant-Bapu-Nagar/079PXX79-XX79-110420173655-D4K6_QWhtZWRhYmFkIENDTkEgVHJhaW5pbmcgSW5zdGl0dXRlcw==_BZDET">
                        <img width="83" height="56" border="0" src="http://images.jdmagicbox.com/upload_test/ahmedabad/b4/079pxx79.xx79.110420172948.d4b4/logo/faf3f2409ed7993aaa70f848ab0bb6fb_t.jpg" class="Clogo" />
                    </a>
                                        <!-- <span class="noLogo"></span> -->
                                                        <section class="jrcl">
                    <p>
                        **A/35, Lakhani Chamber, Toll Naka, Opp Kakadia Hospital, Below Sankalp Reataurant, Bapu Nagar, Ahmedabad - 380024**                                                                |<a href="http://example.com/Ahmedabad/Aptech-N-Power-Hardware-Networking-&lt;near&gt;-Toll-Naka-Opp-Kakadia-Hospital-Below-Sankalp-Reataurant-Bapu-Nagar/079PXX79-XX79-110420173655-D4K6_QWhtZWRhYmFkIENDTkEgVHJhaW5pbmcgSW5zdGl0dXRlcw==_BZDET/map"> View Map</a><br>
                                                    </p>

From the XML data above, I want to extract the following: A/35, Lakhani Chamber, Toll Naka, Opp Kakadia Hospital, Below Sankalp Reataurant, Bapu Nagar, Ahmedabad - 380024

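I need help writing a regular expression that finds and removes every line containing special characters. I am currently using the following regex: /(\<.+?>)/g Please help. Thanks.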

2 answers:

Answer 0 (score: 0)

I think you want to delete the lines that consist of HTML tags, so try this:

/^<.*>\n/g
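
A minimal Python sketch of this idea, assuming the markup is held in a string. The `sample` text and the name `drop_tag_lines` are made up for illustration. Two adjustments here are assumptions layered on the pattern above: the multiline flag, so that ^ matches at the start of every line instead of only at the start of the input, and optional leading whitespace, because the lines in the question are indented.

```python
import re

# A line that is nothing but markup: optional indentation, "<", anything,
# ">", optional trailing spaces, then the line break (or end of input).
TAG_ONLY_LINE = re.compile(r"^[ \t]*<.*>[ \t]*(?:\n|\Z)", re.MULTILINE)


def drop_tag_lines(text: str) -> str:
    """Remove every line that starts with "<" and ends with ">"."""
    return TAG_ONLY_LINE.sub("", text)


# Hypothetical, shortened stand-in for the markup in the question.
sample = (
    '<section class="jrcl">\n'
    "    <p>\n"
    "        **A/35, Lakhani Chamber, Bapu Nagar, Ahmedabad - 380024**\n"
    "    </p>\n"
    "</section>\n"
)

cleaned = drop_tag_lines(sample)
print(cleaned)
```

Note that a line mixing text and tags, such as the address line in the question (which ends with a "View Map" link), does not start with "<" and is therefore left in place, trailing tags included.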

Answer 1 (score: 0)

Try this:

/(?<=\*{2})([^<>]*?)(?=\*{2})/g

It matches everything between the two ** markers.
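
A short Python sketch of how this pattern might be applied, assuming the ** markers are literally present in the input and not just emphasis added when the question was posted. The helper name `extract_marked` and the example.com link in `sample` are hypothetical.

```python
import re

# Text enclosed by "**" on both sides, with no "<" or ">" in between.
# The lookbehind and lookahead keep the markers out of the match.
BETWEEN_STARS = re.compile(r"(?<=\*{2})([^<>]*?)(?=\*{2})")


def extract_marked(text: str) -> list[str]:
    """Return the non-empty, trimmed pieces found between "**" markers."""
    return [m.strip() for m in BETWEEN_STARS.findall(text) if m.strip()]


# Hypothetical stand-in for the address line in the question.
sample = (
    "<p>\n"
    "    **A/35, Lakhani Chamber, Toll Naka, Opp Kakadia Hospital, "
    "Below Sankalp Reataurant, Bapu Nagar, Ahmedabad - 380024**"
    '      |<a href="http://example.com/map"> View Map</a><br>\n'
    "</p>\n"
)

addresses = extract_marked(sample)
print(addresses)
```

Because the character class excludes "<" and ">", a match can never run across a tag. If the input held several marked spans with only plain text between them, that in-between text would also match, so the sketch suits input like the sample, where each marked span is followed by markup.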