How to get the meta tags and page title from an external website
How can I get the meta tags from an external website?
<input type="text" value="http://" id="externalurl" />
Answer 0 (score: 2)
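Personally, I would solve this with PHP rather than JavaScript. If JavaScript really is necessary, you could make an AJAX request to a PHP page. I would start with this PHP library, "http://sourceforge.net/projects/simplehtmldom/", and then do something along these lines: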
// Load the Simple HTML DOM library (provides file_get_html)
include('simple_html_dom.php');
// Create DOM from URL or file
$url = 'http://www.example.com/';
$html = file_get_html($url);
// Find all meta tags
$meta = array();
foreach ($html->find('meta') as $element) {
    $temp['name'] = $element->name;
    $temp['content'] = $element->content;
    $meta[] = $temp;
}
// Run checks on the array of meta tags, or whatever you are trying to achieve
I haven't tested this, but I saw the question and immediately thought of this library! Hope it helps.
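If the lookup has to be triggered from JavaScript, as mentioned above, the PHP side could look roughly like the sketch below. It is only an illustration under some assumptions: the file name `meta.php`, the `url` query parameter, and the helper `normalizeExternalUrl` are made up for this example, and it uses PHP's built-in `get_meta_tags()` instead of the library, which only picks up `<meta name="...">` tags.

```php
<?php
// Hypothetical helper: accept only absolute http(s) URLs, such as the value
// typed into the #externalurl input from the question.
function normalizeExternalUrl(string $raw): ?string
{
    $url = trim($raw);
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $url : null;
}

// Hypothetical endpoint body, e.g. saved as meta.php and requested via AJAX
// as meta.php?url=... (both names are assumptions, not part of the question).
if (isset($_GET['url']) && is_string($_GET['url'])) {
    header('Content-Type: application/json');
    $url = normalizeExternalUrl($_GET['url']);
    if ($url === null) {
        http_response_code(400);
        echo json_encode(['error' => 'invalid url']);
    } else {
        // get_meta_tags() is built into PHP and returns name => content pairs.
        $tags = @get_meta_tags($url);
        echo json_encode($tags === false ? [] : $tags);
    }
}
```

Validating the URL first keeps the script from fetching arbitrary schemes on behalf of the browser; a real deployment would probably want further restrictions on which hosts may be requested.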
Edit after testing: after playing around with it for a while, this code:
<?php
include('simple_html_dom.php');
// Create DOM from URL or file
$url = 'http://www.amazon.com/';
$html = file_get_html($url);
// Find all meta tags
$meta = array();
foreach ($html->find('meta') as $element) {
    $temp['name'] = $element->name;
    $temp['content'] = $element->content;
    $temp['charset'] = $element->charset;
    $meta[] = $temp;
    // Reset to an empty array; resetting to "" breaks the next iteration on newer PHP versions
    $temp = array();
}
print_r($meta);
?>
Output:
Array
(
    [0] => Array
        (
            [name] =>
            [content] => on
            [charset] =>
        )

    [1] => Array
        (
            [name] =>
            [content] => text/html; charset=iso-8859-1
            [charset] =>
        )

    [2] => Array
        (
            [name] => description
            [content] => Online shopping from the earth's biggest selection of books, magazines, music, DVDs, videos, electronics, computers, software, apparel & accessories, shoes, jewelry, tools & hardware, housewares, furniture, sporting goods, beauty & personal care, broadband & dsl, gourmet food & just about anything else.
            [charset] =>
        )

    [3] => Array
        (
            [name] => keywords
            [content] => Amazon, Amazon.com, Books, Online Shopping, Book Store, Magazine, Subscription, Music, CDs, DVDs, Videos, Electronics, Video Games, Computers, Cell Phones, Toys, Games, Apparel, Accessories, Shoes, Jewelry, Watches, Office Products, Sports & Outdoors, Sporting Goods, Baby Products, Health, Personal Care, Beauty, Home, Garden, Bed & Bath, Furniture, Tools, Hardware, Vacuums, Outdoor Living, Automotive Parts, Pet Supplies, Broadband, DSL
            [charset] =>
        )

    [4] => Array
        (
            [name] => google-site-verification
            [content] => 9vpzZueNucS8hPqoGpZ5r10Nr2_sLMRG3AnDtNlucc4
            [charset] =>
        )

)
That seems to pick up just about everything!
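The question also asks about the page title, and the entries with an empty name in the output are most likely tags that use `http-equiv` rather than `name`. As a rough sketch of how both could be covered, here is a variant that swaps in PHP's built-in `DOMDocument` for the library. The function name `extractTitleAndMeta` and the sample markup are made up for illustration, and in practice the HTML would come from fetching the external URL.

```php
<?php
// Hypothetical helper: extract the <title> and every <meta> tag from an HTML string.
function extractTitleAndMeta(string $html): array
{
    if (trim($html) === '') {
        return ['title' => null, 'meta' => []];
    }

    $doc = new DOMDocument();
    // Real-world pages are rarely valid markup, so collect libxml's parse
    // errors internally instead of emitting them as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titleNode = $doc->getElementsByTagName('title')->item(0);
    $title = $titleNode !== null ? trim($titleNode->textContent) : null;

    $meta = [];
    foreach ($doc->getElementsByTagName('meta') as $element) {
        // Identify the tag by name, http-equiv, or property, whichever comes first.
        $key = '';
        foreach (['name', 'http-equiv', 'property'] as $attribute) {
            if ($element->hasAttribute($attribute)) {
                $key = $element->getAttribute($attribute);
                break;
            }
        }
        $meta[] = [
            'key' => $key,
            'content' => $element->getAttribute('content'),
            'charset' => $element->getAttribute('charset'),
        ];
    }

    return ['title' => $title, 'meta' => $meta];
}

// Usage with a small inline sample instead of a live page.
$sample = '<html><head><title> Example page </title>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    . '<meta name="description" content="A sample description">'
    . '</head><body></body></html>';
print_r(extractTitleAndMeta($sample));
```

Keeping the parsing in a function that takes a string makes it easy to try on saved pages before pointing it at a live site.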