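I'm using Guzzle's POST method to fetch a URL. It works and returns the page I want. The problem is that when I try to read the value of an input element in a form on that page, the crawler returns nothing, and I don't know why.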
PHP:
<?php
use Symfony\Component\DomCrawler\Crawler;
use Guzzle\Http\Client;
$client = new Client();
$request = $client->get("https://example.com");
$response = $request->send();
$getRequest = $response->getBody();
$cookie = $response->getHeader("Set-Cookie");
$request = $client->post('https://example.com/page_example.php', array(
    'Content-Type' => 'application/x-www-form-urlencoded',
    'Cookie' => $cookie
), array(
    'param1' => 5,
    'param2' => 10,
    'param3' => 20
));
$response = $request->send();
$pageHTML = $response->getBody();
//fetch orderID
$crawler = new Crawler($pageHTML);
$orderID = $crawler->filter("input[name=orderId]")->attr('value');//there is only one element with this name
echo $orderID; //returns nothing
What should I do?
Answer 0 (score: 1)
You don't have to create the Crawler yourself:
$crawler = $client->post('https://example.com/page_example.php', array(
    'Content-Type' => 'application/x-www-form-urlencoded',
    'Cookie' => $cookie
), array(
    'param1' => 5,
    'param2' => 10,
    'param3' => 20
));
$orderID = $crawler->filter("input[name=orderId]")->attr('value');
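This assumes your POST is not redirected. If it is redirected, add the following before calling filter() (getResponse() and followRedirect() are methods of Symfony's BrowserKit test client, and assertTrue() comes from PHPUnit, so this applies inside a functional test):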
$this->assertTrue($client->getResponse()->isRedirect());
$crawler = $client->followRedirect();
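
To tell whether the problem is in the returned HTML or in how it is handed to the Crawler, it can help to run the same lookup with PHP's built-in DOM classes. The sketch below swaps the Crawler for DOMDocument and DOMXPath, so it is a cross-check rather than the approach from the question or the answer. The helper name extractInputValue and the sample markup are hypothetical. It casts its input to a string first, on the assumption that the HTTP client may return the body as a stream object rather than a plain string (in Guzzle 3, getBody() returns such an object unless called as getBody(true)).

```php
<?php
// Hypothetical helper (not from the original post): returns the value attribute
// of the first <input> with the given name, or null if no such input exists.
// Assumes $name contains no double quotes, since it is placed inside an XPath string.
function extractInputValue($html, $name)
{
    // Cast in case a stream/body object with __toString() was passed in.
    $html = (string) $html;
    if (trim($html) === '') {
        return null; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//input[@name="' . $name . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->getAttribute('value');
}

// Stand-in for the page returned by the POST request.
$pageHTML = '<html><body><form>'
    . '<input type="hidden" name="orderId" value="12345">'
    . '</form></body></html>';

$orderID = extractInputValue($pageHTML, 'orderId');
echo $orderID === null ? "orderId input not found\n" : $orderID . "\n";
```

If this finds the value while the Crawler does not, the HTML itself is fine, and passing the body to the Crawler as an explicit string is worth trying. If it reports that the input is missing, the page returned by the POST is probably not the one expected, for example because of a redirect or a missing session cookie.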