<?php
include('../simple_html_dom.php');
$fname = "http://www.myurl.com";
$html = file_get_html($fname);
$divs = $html->find('h6');
foreach($divs as $element)
{
$title = $element->find('a', 0)->plaintext;
echo $title.'<br>';
}
echo '<br>';
?>
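I get this error:
"failed to open stream: HTTP request failed! HTTP/1.1 500 Internal Server Error ......."
My URL is very long: it is actually 750 characters. If I fetch it with wget, it reports "File name too long".
How can I fix this? I need it to work with Simple HTML DOM.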
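Answer 0: (score: 2)
A URL length of 750 characters is fine. The most commonly cited practical limit is roughly 2,000 characters, which comes from older versions of Internet Explorer.
The server is more likely rejecting the request because it does not look like it comes from a browser, so you should try to emulate a web browser making the request. See this other question.
Edit: using cURL with your code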
<?php
// include is not a function, don't use parens (also use require instead)
require '../simple_html_dom.php';
$fname = "http://www.myurl.com";
$agent= 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';
$ch = curl_init();
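// only matters for https:// URLs; disabling peer verification is insecure, so remove it if you can
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// verbose logging is left off so it doesn't pollute your output
//curl_setopt($ch, CURLOPT_VERBOSE, true);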
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_URL, $fname);
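$result = curl_exec($ch);
if ($result === false) {
    // report transport-level failures instead of parsing an empty document
    die('cURL error: ' . curl_error($ch));
}
curl_close($ch);
$html = new simple_html_dom();
$html->load($result);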
$divs = $html->find('h6');
foreach($divs as $element)
{
$title = $element->find('a', 0)->plaintext;
echo $title.'<br>';
}
echo '<br>';
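The same browser emulation can also be done without cURL by passing a stream context to PHP's HTTP wrapper. This is only a minimal sketch, assuming `allow_url_fopen` is enabled; the helper name `build_browser_context` and the 15-second timeout are illustrative choices, not part of Simple HTML DOM.

```php
<?php
// Hypothetical helper: builds a stream context that sends a browser-like
// User-Agent. The function name and the default timeout are assumptions.
function build_browser_context(string $agent, int $timeout = 15)
{
    return stream_context_create([
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $agent,
            'timeout'       => $timeout,
            // Return the body even on 4xx/5xx so the error page can be inspected.
            'ignore_errors' => true,
        ],
    ]);
}

$agent   = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';
$context = build_browser_context($agent);

// Usage (needs network access and simple_html_dom.php, so it is commented out):
// $body = file_get_contents('http://www.myurl.com', false, $context);
// $html = str_get_html($body);
```

With `ignore_errors` set, a 500 response no longer makes `file_get_contents` fail outright, which makes it easier to see what the server actually sent back.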
Answer 1: (score: 0)
The URL length is fine. The link may be broken or expired. I tried the link shown below, and the result looks fine:
<?php
include("simple_html_dom.php");
$fname = "http://www.youtubeonfire.com/?genre=0&language=0&next_token=rO0ABXNyACdjb20uYW1hem9uLnNkcy5RdWVyeVByb2Nlc3Nvci5Nb3JlVG9rZW7racXLnINNqwMA%0AC0kAFGluaXRpYWxDb25qdW5jdEluZGV4WgAOaXNQYWdlQm91bmRhcnlKAAxsYXN0RW50aXR5SURa%0AAApscnFFbmFibGVkSQAPcXVlcnlDb21wbGV4aXR5SgATcXVlcnlTdHJpbmdDaGVja3N1bUkACnVu%0AaW9uSW5kZXhaAA11c2VRdWVyeUluZGV4TAANY29uc2lzdGVudExTTnQAEkxqYXZhL2xhbmcvU3Ry%0AaW5nO0wAEmxhc3RBdHRyaWJ1dGVWYWx1ZXEAfgABTAAJc29ydE9yZGVydAAvTGNvbS9hbWF6b24v%0Ac2RzL1F1ZXJ5UHJvY2Vzc29yL1F1ZXJ5JFNvcnRPcmRlcjt4cAAAAAEAAAAAAAABds0AAAAAAQAA%0AAAC71ED7AAAAAAFwdAAQMDAwMDAwMDAwMDAwMjAxM35yAC1jb20uYW1hem9uLnNkcy5RdWVyeVBy%0Ab2Nlc3Nvci5RdWVyeSRTb3J0T3JkZXIAAAAAAAAAABIAAHhyAA5qYXZhLmxhbmcuRW51bQAAAAAA%0AAAAAEgAAeHB0AApERVNDRU5ESU5HeA%3D%3D&sort=2";
$html = file_get_html($fname);
$divs = $html->find("h6");
foreach($divs as $element) {
$title = $element->find("a", 0)->plaintext;
echo($title . "<br />");
}
echo("<br />");
Output:
Spider (2013)
500 MPH STORM 2013 HD
Van Diemans Land (Action,Adventure,20...
Good Agent is A Bad Agent (Full HQ En...
Employee of the Month (Full HQ Englis...
The Croods (2013)
GIRLFRIENDS - 2013
Boys Are Pigs-2013
The Patriot -2013
My Daughter's Secret -2013
Dead on Arrival [2013]
Flght 2013XViD1
Samsung Galaxy S4 Presentation UNPACK...
Affinity 2013
Golden Globe Awards 2013: Full Show
Parker-2013
Hells' Kitchen- New Action Movie 2013
ALIENS [2013]
7 Nights Of Darkness -2013
Hansel And Gretel 2013
The Collection (2012)
Mac And Devin Go To High School 2012
Red Dawn (2012)
Hijacked -2012
Bending The Rules -2012
Inside -2012
VAMPIRELAND-2012
Dead Mine -2012
Devil Seed-2012
Kill Em All -2012
One In The Chamber -2012
The Forger - 2012
Dark Desire -2012
A Common Man -2012 .
The Helpers -2012
Red Dawn- 2012 720p
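So fix the problem with the URL and everything will work fine!
Answer 2: (score: 0)
You say your URL works fine in your browser, yet all of us here get the 500 error, just as your script does.
The site probably checks the token in the URL against the IP address, and possibly against other headers of the request. So you need to find a way to obtain the tokenized URL from within your PHP script.
To do that, first download the home page from your PHP script, then find the URL of the "next" link on it and use that URL in your script.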
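The link-extraction step could be sketched as follows. This is an illustration under assumptions, not the site's actual markup: it guesses that the "next" link can be recognized by a `next_token` query parameter (the parameter name seen in the URL from Answer 1), and `find_next_link` is a hypothetical helper built on PHP's bundled DOM extension.

```php
<?php
// Hypothetical helper: returns the href of the first link that carries a
// "next_token" query parameter, or null if the page has no such link.
function find_next_link(string $pageHtml): ?string
{
    $doc = new DOMDocument();
    // Suppress warnings triggered by real-world, non-well-formed HTML.
    @$doc->loadHTML($pageHtml);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//a[contains(@href, "next_token=")]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->getAttribute('href');
}

// Offline demonstration with a made-up page fragment.
$sample = '<html><body><h6><a href="/watch?v=1">Title</a></h6>'
        . '<a href="/?genre=0&amp;next_token=abc123&amp;sort=2">Next</a></body></html>';
$next = find_next_link($sample);
echo $next === null ? 'no next link' : $next, PHP_EOL;
```

In a real script, the home page would be downloaded by the same PHP process (same IP and headers) that later requests the returned URL, and a relative href would need to be resolved against the site's base URL first.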