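I want to parse the page that contains a form with a token input. I need to read this token value and send it along with my own inputs.
This is the curl code I used before the token input was added: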
$username = @$_POST['user'];
$password = @$_POST['password'];
$to = @$_POST['to'];
$text = @$_POST['text'];
$loginUrl = '';
$sendUrl = '';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $loginUrl);
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/32.0.1700.107 Chrome/32.0.1700.107 Safari/537.36');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "user=$username&password=$password");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie-name');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch)."\n";
}
//sending
curl_setopt($ch, CURLOPT_URL, $sendUrl);
curl_setopt($ch, CURLOPT_POSTFIELDS, "recipients=$to&message_body=$text");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie-name-send');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie-send.txt');
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch)."\n";
}
echo $answer;
This is the page I want to parse:
<form name="user_action" method="post" action="index.php?page=11&lang=ge">
<input type="hidden" name="csrf_token" value="7e71ea58eaaa55986b0fdc71b2d44c92">
<input type="text" id="user" name="user" class="round_border medium_box">
<input type="password" id="password" name="password" class="round_border medium_box">
<input type="submit" value="შესვლა" class="btn red_btn round_border medium">
</form>
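It does not work without this token: <input type="hidden" name="csrf_token" value="7e71ea58eaaa55986b0fdc71b2d44c92">
I need to parse this page first to get the token, and then send the POST request with this token included.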
Answer 0 (score: 1)
Check out DOMDocument for parsing HTML. https://www.php.net/manual/en/class.domdocument.php
Here is what I would try:
<?php
$page = file_get_contents("https://wherever-the-form-is.com");
$dom = new DOMDocument();
$dom->loadHTML( $page );
// Get a list of all inputs
$inputs = $dom->getElementsByTagName( 'input' );
$total = $inputs->length;
$token = false;
// Loop through inputs looking for one with the right name
for( $i = 0; $i < $total; $i++ ) {
if ( $inputs->item($i)->getAttribute('name') == 'csrf_token' ) {
// When you find the right name, record the value and break out of the loop
$token = $inputs->item($i)->getAttribute('value');
break;
}
}
if ( $token ) {
// Your code here
}
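One caveat with the snippet above: a CSRF token is often bound to the session cookie, so a page fetched with file_get_contents and a POST sent through curl may not count as the same session. The following is a minimal sketch of doing both steps on one curl handle. It assumes the token is session-bound and that the form is posted back to the same URL it was fetched from; neither is confirmed by the question. `extractCsrfToken` and `loginWithToken` are hypothetical helper names, not part of any library.

```php
<?php
// Hypothetical helper: return the value of the named input, or null if it is absent.
function extractCsrfToken(string $html, string $fieldName = 'csrf_token'): ?string
{
    if ($html === '') {
        return null; // loadHTML rejects an empty string
    }
    // Collect libxml warnings internally instead of printing them for sloppy HTML.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('input') as $input) {
        if ($input->getAttribute('name') === $fieldName) {
            return $input->getAttribute('value');
        }
    }
    return null;
}

// Hypothetical helper: GET the form, then POST credentials plus token on the same handle.
// Returns the response body, or false if the page or the token could not be obtained.
function loginWithToken(string $loginUrl, string $username, string $password, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Same file for reading and writing, so the session cookie survives both requests.
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);

    // Step 1: fetch the page that contains the form.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    $page = curl_exec($ch);
    $token = is_string($page) ? extractCsrfToken($page) : null;
    if ($token === null) {
        return false;
    }

    // Step 2: post the credentials together with the token (assumed: same URL).
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query([
        'csrf_token' => $token,
        'user'       => $username,
        'password'   => $password,
    ]));
    return curl_exec($ch);
}

// Usage on the markup from the question (no network access needed for this part).
$sample = '<form name="user_action" method="post" action="index.php?page=11&lang=ge">'
        . '<input type="hidden" name="csrf_token" value="7e71ea58eaaa55986b0fdc71b2d44c92">'
        . '<input type="text" id="user" name="user">'
        . '</form>';
var_dump(extractCsrfToken($sample));
```

If the form's action attribute points somewhere other than the page URL, the second request would need that URL instead.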
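Answer 1 (score: 0)
@Stevish is right that you should use DOMDocument, but I would suggest a slightly different approach: loop over all the input children of the form instead, e.g.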
$domd=new DOMDocument();
@$domd->loadHTML($answer); // loadHTML is not static; call it on an instance
$xp=new DOMXPath($domd);
$inputs=array();
foreach($xp->query("//form[@name='user_action']//input") as $input){
$inputs[$input->getAttribute("name")]=$input->getAttribute("value");
}
which should give you
$inputs=array (
'csrf_token' => '7e71ea58eaaa55986b0fdc71b2d44c92',
'user' => '',
'password' => '',
'' => 'შესვლა',
)
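You are also not URL-encoding $to or $text here. What happens if $text contains blabla&recipients=moreblabla? The extra recipients= pair would be parsed as a separate field, overriding your earlier $to and making the $to variable irrelevant. You need to URL-encode these values, so either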
curl_setopt($ch, CURLOPT_POSTFIELDS, "recipients=".urlencode($to)."&message_body=".urlencode($text));
or better yet, use http_build_query
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(
array(
"recipients" => $to,
"message_body" => $text
)
));
...or even better,
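$domd=new DOMDocument();
@$domd->loadHTML($answer); // loadHTML is not static; call it on an instance
$xp=new DOMXPath($domd);
$inputs=array();
foreach($xp->query("//form[@name='user_action']//input") as $input){
$name=$input->getAttribute("name");
if($name===""){
continue; // nameless inputs (e.g. the submit button) are not submitted by browsers
}
$inputs[$name]=$input->getAttribute("value");
}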
$inputs["recipients"]=$to;
$inputs["message_body"]=$text;
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($inputs));
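To see why the encoding matters, here is a small sketch that builds the body both ways and decodes it with parse_str, which follows the same rules PHP uses to fill $_POST. The sample values are made up for illustration, and a server written in another language may resolve duplicate field names differently.

```php
<?php
$to   = '555123';                    // hypothetical recipient
$text = 'hello&recipients=999999';   // hypothetical message containing a field separator

// Naive concatenation: the "&" inside $text starts a new field.
$raw = "recipients=$to&message_body=$text";
parse_str($raw, $rawFields);

// http_build_query percent-encodes "&" and "=", so the message stays one field.
$safe = http_build_query(['recipients' => $to, 'message_body' => $text]);
parse_str($safe, $safeFields);

var_dump($rawFields);   // recipients has been replaced by the value smuggled in via $text
var_dump($safeFields);  // both fields decode back to the original values
```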