Misspelled RegEx in the PHP function preg_match()

Time: 2018-08-26 01:59:54

Tags: php regex curl instagram

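This is a question about PHP, cURL, preg_match(), RegEx and Instagram.

I'm trying to write cURL PHP code that logs in to Instagram and then accesses content that is only available to authenticated (logged-in) users. In this question there's an interesting code just for that.

I copy-pasted the following code: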

    <?php

    $username = "yourname";
    $password = "yourpass";
    $useragent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/50.0.2661.102 Chrome/50.0.2661.102 Safari/537.36";
    $cookie=$username.".txt";

    @unlink(dirname(__FILE__)."/".$cookie);

    $url="https://www.instagram.com/accounts/login/?force_classic_login";

    $ch = curl_init();

    $arrSetHeaders = array(
        "User-Agent: $useragent",
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.5',
        'Accept-Encoding: deflate, br',
        'Connection: keep-alive',
        'cache-control: max-age=0',
    );

    curl_setopt($ch, CURLOPT_HTTPHEADER, $arrSetHeaders);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__)."/".$cookie);
    curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__)."/".$cookie);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

    $page = curl_exec($ch);
    curl_close($ch);

    // try to find the actual login form
    if (!preg_match('/<form method="POST" id="login-form" class="adjacent".*?<\/form>/is', $page, $form)) {
        die('Failed to find log in form!');
    }

    $form = $form[0];

    // find the action of the login form
    if (!preg_match('/action="([^"]+)"/i', $form, $action)) {
        die('Failed to find login form url');
    }

    $url2 = $action[1]; // this is our new post url

    // find all hidden fields which we need to send with our login, this includes security tokens
    $count = preg_match_all('/<input type="hidden"\s*name="([^"]*)"\s*value="([^"]*)"/i', $form, $hiddenFields);

    $postFields = array();

    // turn the hidden fields into an array
    for ($i = 0; $i < $count; ++$i) {
        $postFields[$hiddenFields[1][$i]] = $hiddenFields[2][$i];
    }

    // add our login values
    $postFields['username'] = $username;
    $postFields['password'] = $password;

    $post = '';

    // convert to string, this won't work as an array, form will not accept multipart/form-data, only application/x-www-form-urlencoded
    foreach($postFields as $key => $value) {
        $post .= $key . '=' . urlencode($value) . '&';
    }

    $post = substr($post, 0, -1);

    preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $page, $matches);

    $cookieFileContent = '';

    foreach($matches[1] as $item) {
        $cookieFileContent .= "$item; ";
    }

    $cookieFileContent = rtrim($cookieFileContent, '; ');
    $cookieFileContent = str_replace('sessionid=; ', '', $cookieFileContent);

    $oldContent = file_get_contents(dirname(__FILE__)."/".$cookie);
    $oldContArr = explode("\n", $oldContent);

    if(count($oldContArr)) {
        foreach($oldContArr as $k => $line) {
            if(strstr($line, '# ')) {
                unset($oldContArr[$k]);
            }
        }

        $newContent = implode("\n", $oldContArr);
        $newContent = trim($newContent, "\n");

        file_put_contents(
            dirname(__FILE__)."/".$cookie,
            $newContent
        );
    }

    $arrSetHeaders = array(
        'origin: https://www.instagram.com',
        'authority: www.instagram.com',
        'upgrade-insecure-requests: 1',
        'Host: www.instagram.com',
        "User-Agent: $useragent",
        'content-type: application/x-www-form-urlencoded',
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language: en-US,en;q=0.5',
        'Accept-Encoding: deflate, br',
        "Referer: $url",
        "Cookie: $cookieFileContent",
        'Connection: keep-alive',
        'cache-control: max-age=0',
    );

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__)."/".$cookie);
    curl_setopt($ch, CURLOPT_COOKIEFILE, dirname(__FILE__)."/".$cookie);
    curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $arrSetHeaders);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_REFERER, $url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post);

    sleep(5);
    $page = curl_exec($ch);

    preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $page, $matches);
    $cookies = array();
    foreach($matches[1] as $item) {
        parse_str($item, $cookie1);
        $cookies = array_merge($cookies, $cookie1);
    }
    var_dump($page);

    curl_close($ch);

    ?>

You will of course have to change the value of $username from yourname to the actual username, and the value of $password from yourpass to that user's password. Also, I think this code won't work if two-step authentication is enabled on the account (e.g. an MMS sent to the linked phone number, or an email sent to the linked address), so two-step verification has to be disabled in the user's account settings.

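However, when I run this PHP script in Chrome, I get the message Failed to find log in form!, which means that on line 35 the pattern (written as a RegEx) /<form method="POST" id="login-form" class="adjacent".*?<\/form>/is was not found in the variable $page, which is basically the HTML of the classic Instagram login page.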

[Image: RegEx error in preg_match() for the pattern]

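It's strange that the pattern isn't found, because the HTML really does contain <form method="POST" id="login-form" class="adjacent" (to check for yourself, go to the login page, press Ctrl + U, then Ctrl + F, and paste the previous pattern). So I think this is because the RegEx is misspelled. Is that right?

I tested it on the page RegEx101.com and, even more strangely, the test was successful. So why does PHP say the pattern can't be found?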

[Image: RegEx successful test on RegEx101.com]
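To separate "the RegEx is wrong" from "the subject is not what I expect", the pattern can be run against a small hand-written string. This is only a sketch: $sampleHtml is made-up markup (not Instagram's real page) and describeMatch() is a hypothetical helper. It relies on preg_match() returning 1 for a match, 0 for no match and false when the regex engine itself fails.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/<form method="POST" id="login-form" class="adjacent".*?<\/form>/is';

// Made-up sample markup (an assumption, not Instagram's real HTML).
$sampleHtml = '<html><body>'
    . '<form method="POST" id="login-form" class="adjacent" action="/accounts/login/?force_classic_login">'
    . '<input type="hidden" name="csrfmiddlewaretoken" value="abc123"/>'
    . '</form></body></html>';

// Hypothetical helper: names the three possible outcomes of preg_match().
function describeMatch(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);
    if ($result === false) {
        // false means the regex engine failed (invalid pattern, backtrack limit, ...)
        return 'error ' . preg_last_error();
    }
    return $result === 1 ? 'match' : 'no match';
}

echo describeMatch($pattern, $sampleHtml), "\n"; // prints "match"
echo describeMatch($pattern, ''), "\n";          // prints "no match"
```

If the sample reports a match, the pattern itself compiles and works in PHP, which would point at the contents of $page rather than at a spelling mistake in the RegEx.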

The idea is for the pattern to be found, as in the original question's code, so that I can log in automatically/programmatically. Keep in mind that more preg_match() functions are used later in the code, e.g. on lines 42, 48 (preg_match_all()), 70 (preg_match_all()) and 137 (preg_match_all()).
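The later preg_match_all() calls can be tried in isolation as well. The sketch below reuses the two patterns from the script on made-up input; extractHiddenFields(), extractSetCookies(), $sampleForm and $sampleHeaders are hypothetical names and sample data, not part of the original code.

```php
<?php
// Hypothetical helper: collects hidden <input> fields as name => value pairs,
// using the same pattern as the script in the question.
function extractHiddenFields(string $formHtml): array
{
    $fields = [];
    $count = preg_match_all(
        '/<input type="hidden"\s*name="([^"]*)"\s*value="([^"]*)"/i',
        $formHtml,
        $m
    );
    for ($i = 0; $i < $count; ++$i) {
        $fields[$m[1][$i]] = $m[2][$i];
    }
    return $fields;
}

// Hypothetical helper: returns the "name=value" part of each Set-Cookie header,
// using the same pattern as the script in the question.
function extractSetCookies(string $rawHeaders): array
{
    preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $rawHeaders, $m);
    return $m[1];
}

// Made-up sample data (assumptions for illustration only).
$sampleForm = '<form method="POST" id="login-form" class="adjacent">'
    . '<input type="hidden" name="csrfmiddlewaretoken" value="abc123"/>'
    . '</form>';
$sampleHeaders = "HTTP/1.1 200 OK\r\n"
    . "Set-Cookie: csrftoken=abc123; Path=/\r\n"
    . "Set-Cookie: mid=xyz; Path=/\r\n\r\n";

print_r(extractHiddenFields($sampleForm));  // one entry: csrfmiddlewaretoken => abc123
print_r(extractSetCookies($sampleHeaders)); // csrftoken=abc123 and mid=xyz
```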


EDIT: I realized that if I add var_dump($page); before line 35, I get […] (in other words, the HTML output is […]). I think this is why the code returns the message Failed to find log in form!. But why does $page hold that value?
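One thing that might be worth checking (an assumption, not something the post confirms): the script sends its own Accept-Encoding: deflate, br header but never sets CURLOPT_ENCODING, so if the server answers with a compressed body, cURL would return it undecoded and no HTML pattern could match it. The sketch below looks for a Content-Encoding header in a raw response fetched with CURLOPT_HEADER enabled. findContentEncoding() is a hypothetical helper and $sampleResponse is made-up data.

```php
<?php
// Hypothetical helper: returns the last Content-Encoding value found in the
// header part of a raw response, or null when no such header is present.
function findContentEncoding(string $rawResponse, int $headerSize): ?string
{
    // Only look at the headers, never at the (possibly binary) body.
    $headers = substr($rawResponse, 0, $headerSize);
    if (preg_match_all('/^Content-Encoding:[ \t]*([^\r\n]+)/mi', $headers, $m) > 0) {
        $values = $m[1];
        // With redirects there can be several header blocks; keep the last one.
        return strtolower(trim(end($values)));
    }
    return null;
}

// Made-up sample response (an assumption, not real Instagram output).
$sampleResponse = "HTTP/1.1 200 OK\r\n"
    . "Content-Type: text/html\r\n"
    . "Content-Encoding: br\r\n\r\n"
    . "\x1b\x0e\x00binary";
$sampleHeaderSize = strpos($sampleResponse, "\r\n\r\n") + 4;

var_dump(findContentEncoding($sampleResponse, $sampleHeaderSize)); // string(2) "br"
```

In the actual script the header size could come from curl_getinfo($ch, CURLINFO_HEADER_SIZE), called before curl_close($ch). If a Content-Encoding does show up, setting CURLOPT_ENCODING to an empty string asks cURL to advertise the encodings it supports and decode them itself, in which case the manual Accept-Encoding header would probably need to be dropped.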

0 answers:

No answers yet