Capturing links with a regex capture group

Date: 2016-02-18 19:10:20

Tags: regex string regex-group

I need some help capturing all the links from a source like this:

<div class="rc" data-hveid="64">
  <h3 class="r"><a href="http://www.exploit-id.com/author/admin/page/192" onmousedown="return rwt(this,'','','','6','AFQjCNH5m9XNb8x4PVBUhm7rGbpP2-4bGQ','VrQPXm7M8AehGFqUhVzB4g','0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQFghBMAU','','',event)">Exploit-ID » admin</a></h3>
  <div class="s">
    <div>
      <div class="f kv _SWb" style="white-space:nowrap"><cite class="_Rm">www.exploit-id.com/author/admin/page/192</cite>
        <div class="action-menu ab_ctl"><a class="_Fmb ab_button" href="#" id="am-b5" aria-label="Result details" aria-expanded="false" aria-haspopup="true" role="button" jsaction="m.tdd;keydown:m.hbke;keypress:m.mskpe" data-ved="0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQ7B0IQjAF"><span class="mn-dwn-arw"></span></a>
          <div
          class="action-menu-panel ab_dropdown" role="menu" tabindex="-1" jsaction="keydown:m.hdke;mouseover:m.hdhne;mouseout:m.hdhue" data-ved="0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQqR8IQzAF">
            <ul>
              <li class="action-menu-item ab_dropdownitem" role="menuitem"><a class="fl" href="http://webcache.googleusercontent.com/search?q=cache:YjSqy1rHChIJ:www.exploit-id.com/author/admin/page/192+&amp;cd=6&amp;hl=en&amp;ct=clnk&amp;gl=ro" onmousedown="return rwt(this,'','','','6','AFQjCNG68C2DPiAcZmzMREpEY6Jr6vk_yA','FxEOdBFVsvQ3dMMfmjFVHA','0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQIAhEMAU','','',event)">Cached</a>
              </li>
            </ul>
        </div>
      </div>
    </div><span class="st"><span class="f">Oct 4, 2011 - </span>Email : f3arm3d3ar@gmail.com Google-<em>Dork</em> : :) Guess it. <em>Tested</em> on : Ubuntu 10.04 Web-App ... <em>2</em>] The Blue Genius : My Boss. 3] str0ke&nbsp;...</span>
  </div>
</div>
</div>
<!--n-->
</div>
<div class="g">
  <!--m-->
  <div class="rc" data-hveid="70">
    <h3 class="r"><a href="http://arstechnica.com/civis/viewtopic.php?t=472903" onmousedown="return rwt(this,'','','','7','AFQjCNH2u8FitCiv8nceTecHv32S1Q4xxw','IimG-UIeIr4VW8IZui9CDg','0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQFghHMAY','','',event)">The Script Thread - Ars Technica OpenForum</a></h3>
    <div class="s">
      <div>
        <div class="f kv _SWb" style="white-space:nowrap"><cite class="_Rm">arstechnica.com/civis/viewtopic.php?t=472903</cite>
          <div class="action-menu ab_ctl"><a class="_Fmb ab_button" href="#" id="am-b6" aria-label="Result details" aria-expanded="false" aria-haspopup="true" role="button" jsaction="m.tdd;keydown:m.hbke;keypress:m.mskpe" data-ved="0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQ7B0ISDAG"><span class="mn-dwn-arw"></span></a>
            <div
            class="action-menu-panel ab_dropdown" role="menu" tabindex="-1" jsaction="keydown:m.hdke;mouseover:m.hdhne;mouseout:m.hdhue" data-ved="0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQqR8ISTAG">
              <ul>
                <li class="action-menu-item ab_dropdownitem" role="menuitem"><a class="fl" href="http://webcache.googleusercontent.com/search?q=cache:5Jwa1Yuai-UJ:arstechnica.com/civis/viewtopic.php%3Ft%3D472903+&amp;cd=7&amp;hl=en&amp;ct=clnk&amp;gl=ro" onmousedown="return rwt(this,'','','','7','AFQjCNFlNrbGfSbSV-iPPPi8IxcUqzktPw','EbLNu4w0MQUwTey84xbfNA','0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQIAhKMAY','','',event)">Cached</a>
                </li>
              </ul>
          </div>
        </div>
      </div>
      <div class="f slp">Jul 19, 2004 - 40 posts - ‎23 authors</div><span class="st">objResultsFile.WriteLine &quot;&lt;TR&gt;&lt;TH COLSPAN=&#39;<em>2</em>&#39;&gt;Click Server Name For Detailed &quot; _ ... REM - It is recommended to <em>test</em> in a lab first. REM</span>
    </div>
  </div>
</div>
<!--n-->
</div>
<div class="g">
  <!--m-->
  <div class="rc" data-hveid="76">
    <h3 class="r"><a href="http://arstechnica.com/civis/viewtopic.php?f=17&amp;t=624507" onmousedown="return rwt(this,'','','','8','AFQjCNHHo2dqNwuxy1E_YStDv0yd6DBYxw','H1Hu77PQXfgkYXNDDEfH6w','0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQFghNMAc','','',event)">Is it possible to get a page when a process is started - Ars ...</a></h3>
    <div class="s">
      <div>
        <div class="f kv _SWb" style="white-space:nowrap"><cite class="_Rm">arstechnica.com/civis/viewtopic.php?f=17&amp;t=624507</cite>
          <div class="action-menu ab_ctl"><a class="_Fmb ab_button" href="#" id="am-b7" aria-label="Result details" aria-expanded="false" aria-haspopup="true" role="button" jsaction="m.tdd;keydown:m.hbke;keypress:m.mskpe" data-ved="0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQ7B0ITjAH"><span class="mn-dwn-arw"></span></a>
            <div
            class="action-menu-panel ab_dropdown" role="menu" tabindex="-1" jsaction="keydown:m.hdke;mouseover:m.hdhne;mouseout:m.hdhue" data-ved="0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQqR8ITzAH">
              <ul>
                <li class="action-menu-item ab_dropdownitem" role="menuitem"><a class="fl" href="http://webcache.googleusercontent.com/search?q=cache:PyKSd_PhIsoJ:arstechnica.com/civis/viewtopic.php%3Ff%3D17%26t%3D624507+&amp;cd=8&amp;hl=en&amp;ct=clnk&amp;gl=ro" onmousedown="return rwt(this,'','','','8','AFQjCNG4l4bXQwnIsp196jLQbxNEql_zKQ','JZ-s6VvMPwHVE6iqZBs2HQ','0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQIAhQMAc','','',event)">Cached</a>
                </li>
              </ul>
          </div>
        </div>
      </div>
      <div class="f slp">24 posts - ‎8 authors</div><span class="st">Posted: Thu Oct 02, 2003 <em>2</em>:17 pm ... &quot;the <em>dork</em> formerly known as Spiff&quot; .... Also, in my brief <em>test</em> it would only send a message once and not constantly while the&nbsp;...</span>
    </div>
  </div>
</div>
<!--n-->
</div>
<div class="g">
  <!--m-->
  <div class="rc" data-hveid="82">
    <h3 class="r"><a href="http://www.academia.edu/8138550/Psychology" onmousedown="return rwt(this,'','','','9','AFQjCNHW_KCgXPbej6lhLHNfyuTt_pWs4Q','-RghSDshtqCsIIRH2ZEUbg','0ahUKEwj3jpmq_YHLAhWFVSwKHUTjBAMQFghTMAg','','',event)">Psychology | Akinbode Olanike - Academia.edu</a></h3>
    <div class="s">
      <div>

Sorry for the huge source. Anyway, I want to capture all the links that appear between

<h3 class="r"><a href="

and

" onmousedown="return rwt

Can someone help me with a regular expression?

1 answer:

Answer 0 (score: 1)

This should capture every string inside an href attribute (the URL ends up in capture group 1): href="([^"]+)"
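
As a rough sketch of how that pattern could be narrowed to what the question asks for, the Python snippet below prefixes the capture group with the <h3 class="r"><a markup from the sample, so that other hrefs (the "#" buttons and the webcache links) are skipped. Python is my choice here, since the question names no language. The function name extract_title_links and the shortened sample string are made up for illustration, and decoding &amp; with html.unescape is an extra step that the answer does not mention. A regex like this depends on the exact markup and will stop matching if the attribute order or spacing changes.

```python
import re
from html import unescape

# Anchor the capture group to the result-title markup from the question,
# so hrefs such as "#" or the webcache links are not matched.
TITLE_LINK = re.compile(r'<h3 class="r"><a href="([^"]+)"')


def extract_title_links(page_source):
    """Return the href of every <h3 class="r"> title link, with entities decoded."""
    return [unescape(href) for href in TITLE_LINK.findall(page_source)]


# Shortened, hypothetical sample modeled on the markup in the question.
sample = (
    '<h3 class="r"><a href="http://arstechnica.com/civis/viewtopic.php?f=17&amp;t=624507" '
    'onmousedown="return rwt(this)">Is it possible ...</a></h3>'
    '<a class="_Fmb ab_button" href="#" id="am-b7"></a>'
    '<a class="fl" href="http://webcache.googleusercontent.com/search?q=cache:x">Cached</a>'
)

print(extract_title_links(sample))
```

On the same sample, the unanchored href="([^"]+)" from the answer would also return the "#" and webcache entries, which is why the prefix is added here.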