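Regex to extract the captcha image src

Date: 2016-07-26 09:49:48

Tags: php regex

I am trying to parse a cURL response that renders a reCAPTCHA image. Unfortunately my regex experience is limited. How can I use a regular expression to find the src link in this markup?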

<img id="recaptcha_challenge_image" alt="reCAPTCHA-Bild" height="57" width="300" src="https://www.google.com/recaptcha/api/image?c=03AHJ_VuvfHOCmjv_2G6NuZOIpbC-S7FGqqg-WM4ytRKoMD7oeXGHV_fvqim2YJSw3qkD7XcSxSlCOLAeBd4vy0S7KTrLiL7xSwc2YvGnc5Q9EUXllGa-56zXarU7Pq4btSzHn0anQPToXtBgDBdu69P-SwUpnzs1q08w9GTL8TEtpMAYYcW1hAXqpQEnPQO_h5tmtEDyr9WlgvVVTNcMwDsTJOnjsGmH4kg&amp;th=,skNXQ2Kw6Nd0dqSGIn-2Ei0m-GKXNcjwAAAAcaAAAADlawORO7hNRBFlts00ZWNVrY7qTydBWIpzMqbUQsFjKKE1a8QhNJQqFgffS8OED15Kp3u0DRvaI7SCeuXwN6hkhGhMqCqgpe9cY2VyxbkFK8pRpnNcz5gnlz43NZsAyBZmvlX-BmLj6Hgq-6DzW0pWn3WFpGmM2UhRNudpy91Vyg7OwtD0xNhMGubBALOcAmR35Em3TsWYVh9mCL37V0nJ2xEBjjBQqCtn1Yoz5fLgxd3xp7229BR1zIremO1wJMCVljmT5uxYHyArQsN5PYqF-C7u4NkOSKS6PmkRTKAdV1NihEGAKxREfCsttZbVOY7MOVBF4zemvUDwIwA8pMJrnbbwwwXEinA2-w1NNS0tCxrubtqKCxu2EQsHRQ37V5NzI-Y_0qYMdxDa3wuXdS2ojVsjwXzVnBbYnG7LFJ1BJlZev4lugPtZh2siHolIbmHT8L1z3MMk0DEH0QnhNkd9x6e66tRyqYs7PsoteXmqa76sbqb945WtI5jsiJY4wo9yuKtGH03HmxdqhftgOk9OM6Gjjvhu1lxfW8tkOhehrGD5Td7z0L5fywtXmexRSlEQ5B4_OA3LEVmoCMUjW18GXDBj_lZPjAQ-mp6zV4a19ht88ilWfFTanLZ6d9FKsRrdwlNIS6cDzVBKT90mgXKARhibHrrSXujgo1l-gDbJ0o6xJBqSIugP155OVvwhJHW_ofOnvBuxgbvvsvOfskyGcFdnPoBIwrK-47AHx4H2jryUbCc3wLAtOcUicS_I2PRxKSUuUmYUk-bQq00scg0mDoI6QlD12pkPvmNA_QDyPqKjv5z8fc5HLVIAqFdBFdbWImHFKku0clxNX_qebl7r-C7e7LNBTngIFRdtFzAX_VjZHqRouemq2y89UA30WP65JSzzbUPt-z-tb6eKW3QD0eOlm28YkbYib9mdl85bIy61rS8bCHtFuKlcTMSzZyqMJhH25faKCTPkkXHhkPnO7IkMEmyll3LA5kjkc9RwTWFgF64RqLC-BqLscVi0GbVcCodMSVy1-kRGRqPr2ZaMwbLJSJq94Dy7reaex9rgiWEfpM0jEj_b1UeGUAEENhcPM3N63bPF3_F39H1YX3oBve4UXURVo7JkU2-C6o1HmB7Xr74JMEpPl8Vj1zImRk7SSB8Z6KEGv8Nj2f2Pq0hgaiehokt9I1JpnFprXtlEQW3vvgDZa01jYb6kmKVJNdaq0mIuvg">

My actual code:

<?php
    class get_recaptcha {
        function __construct(){
            $this->website = 'http://registration.zwinky.com/registration/register.jhtml';
        }

        function curl_post(){
            $ch = curl_init();
            $timeout = 5;
            curl_setopt($ch, CURLOPT_URL, $this->website);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
            $data = curl_exec($ch);
            curl_close($ch);
            return $data;
        }

        function parse_content(){
            preg_match('regexFilter', $this->curl_post(), $match); 
            $captcha_image = $match[1];
            return $captcha_image;
        }
    }

    $get_captcha = new get_recaptcha();
    $captcha_response = $get_captcha->parse_content();
    echo $captcha_response;
?>

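Can someone help me set up the regex filter?

1 answer:

Answer 0: (score: 1)

It looks like the captcha sits inside an iframe, so you need to request the iframe's URL rather than the registration page itself. The code below uses the iframe URL I found on that page. It is not a regex solution, but it does get the information you need. If the iframe URL changes, you will need code that fetches the iframe's src URL first. Hope it helps.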

$url='http://www.google.com/recaptcha/api/noscript?k=6Ldx8OgSAAAAAOQu76OwUC1XwCxpEZU576k0gHIR';
$html=file_get_contents( $url );

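$dom=new DOMDocument;
// Real-world HTML is rarely well-formed; keep libxml warnings out of the output.
libxml_use_internal_errors(true);
$dom->loadHTML($html);

$col=$dom->getElementsByTagName('img');
foreach($col as $n){
    // Print the src attribute of every <img> element on the page.
    echo $n->getAttribute('src'), PHP_EOL;
}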
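
For completeness, here is a minimal sketch of the regex approach the question asked for, plus a DOM-based lookup of the iframe URL mentioned above. The helper names `extract_captcha_src` and `extract_iframe_src` are hypothetical and do not appear in the original post. The pattern assumes the `id` attribute comes before `src` inside the tag and that both values are double-quoted, as in the markup shown in the question; it will not match if the attribute order or quoting differs.

```php
<?php
// Hypothetical helper (not from the original post): return the src of the
// <img id="recaptcha_challenge_image"> element, or null if it is not found.
function extract_captcha_src(string $html): ?string
{
    // Assumption: id precedes src in the tag and src is double-quoted.
    $pattern = '/<img\b[^>]*\bid="recaptcha_challenge_image"[^>]*\bsrc="([^"]+)"/i';
    if (preg_match($pattern, $html, $match) !== 1) {
        return null;
    }
    // The attribute value encodes "&" as "&amp;"; decode it to get a usable URL.
    return html_entity_decode($match[1]);
}

// Hypothetical helper: return the src of the first <iframe>, or null if none.
function extract_iframe_src(string $html): ?string
{
    $dom = new DOMDocument;
    // Keep parser warnings about malformed HTML out of the output.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $frames = $dom->getElementsByTagName('iframe');
    return $frames->length > 0 ? $frames->item(0)->getAttribute('src') : null;
}
```

With these in place, `parse_content()` from the question could return `extract_captcha_src($html)`, where `$html` is the content fetched from the iframe URL rather than from the registration page. A DOM parser generally copes better with changes in attribute order than a regex does, so the regex version is best treated as a quick fix for markup you control or have inspected.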