Get an IMDb poster image from a search term using PHP

Time: 2012-05-10 15:11:23

Tags: php imdb

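I want to use PHP to get the poster image URL from IMDb for a search term. For example, given the search term 21 Jump Street, I want to get back the image, or just the IMDb movie URL. With the code below, the only thing I still need is to retrieve the movie's URL from the search term.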

Here is my code:

<?php

include("simple_html_dom.php");

//url to imdb page
$url = 'hereistheurliwanttogetfromsearch';

//get the page content
$imdb_content = file_get_contents($url);

$html = str_get_html($imdb_content);

$name = $html->find('title',0)->plaintext;

$director = $html->find('a[itemprop="director"]',0)->innertext;

$plot = $html->find('p[itemprop="description"]',0)->innertext;

$release_date = $html->find('time[itemprop="datePublished"]',0)->innertext;

$mpaa = $html->find('span[itemprop="contentRating"]',0)->innertext;

$run_time = $html->find('time[itemprop="duration"]',0)->innertext;

$img = $html->find('img[itemprop="image"]',0)->src;

$content = "";

//build content
$content.= '<h2>Film</h2><p>'.$name.'</p>';
$content.= '<h2>Director</h2><p>'.$director.'</p>';
$content.= '<h2>Plot</h2><p>'.$plot.'</p>';
$content.= '<h2>Release Date</h2><p>'.$release_date.'</p>';
$content.= '<h2>MPAA</h2><p>'.$mpaa.'</p>';
$content.= '<h2>Run Time</h2><p>'.$run_time.'</p>';
$content.= '<h2>Full Details</h2><p><a href="'.$url.'" rel="nofollow">'.$url.'</a></p>';
$content.= '<img src="'.$img.'" />';

echo $content;

?>

2 Answers:

Answer 0: (score: 2)

Using the API suggested by Kasper Mackenhauer Jacobsenless, here is a more complete answer:

$url = 'http://www.imdbapi.com/?i=&t=21+jump+street';

$json_response = file_get_contents($url);
$object_response = json_decode($json_response);

if(!is_null($object_response) && isset($object_response->Poster)) {
    $poster_url = $object_response->Poster;
    echo $poster_url."\n";
}
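
To use this with an arbitrary search term rather than a hard-coded one, the URL can be built from the term and the JSON handling pulled into a function. This is only a minimal sketch: the helper names `build_poster_lookup_url` and `extract_poster_url` are hypothetical, and it assumes the endpoint still accepts the `i` and `t` parameters and returns a `Poster` field as shown above, which may no longer be true.

```php
<?php
// Hypothetical helper: build the lookup URL for a title.
// The endpoint and the `i`/`t` parameters are copied from the answer above.
function build_poster_lookup_url(string $title): string
{
    // http_build_query() URL-encodes the value (spaces become "+").
    return 'http://www.imdbapi.com/?' . http_build_query(['i' => '', 't' => $title]);
}

// Hypothetical helper: return the Poster field of a JSON response, or null.
function extract_poster_url($json): ?string
{
    if (!is_string($json)) {
        return null; // file_get_contents() returns false on failure
    }
    $data = json_decode($json, true);
    if (!is_array($data) || empty($data['Poster']) || !is_string($data['Poster'])) {
        return null; // invalid JSON, or no poster in the response
    }
    return $data['Poster'];
}

// Usage (performs a network request, so it is left commented out):
// $json   = @file_get_contents(build_poster_lookup_url('21 Jump Street'));
// $poster = extract_poster_url($json);
// if ($poster !== null) { echo $poster, "\n"; }
```

Keeping the parsing separate from the request makes it possible to check the logic against a saved response without touching the network.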

Answer 1: (score: 0)

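Parsing with regular expressions is bad practice, but there is very little here that could break. Using cURL is advisable: it is faster, and you can mask your user agent.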

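The main problem with getting the image from a search is that you first need to know the IMDb ID; only then can you load the title page and rip the image URL. Hope it helps.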

<?php 
//Is form posted
if($_SERVER['REQUEST_METHOD']=='POST'){
    $find = $_POST['find'];

    //Get Imdb code from search
    $source = file_get_curl('http://www.imdb.com/find?q='.urlencode(strtolower($find)).'&s=tt');
    if(preg_match('#/title/(.*?)/mediaindex#',$source,$match)){
        //Get main page for imdb id
        $source = file_get_curl('http://www.imdb.com/title/'.$match[1]);
        //Grab the first .jpg image, assumed to be the main poster
        //(non-greedy match; the capture keeps the .jpg extension)
        if(preg_match('#<img src="([^"]*?\.jpg)"#',$source,$match)){
            $imdb=$match[1];
            //do something with image
            echo '<img src="'.htmlspecialchars($imdb).'" />';
        }
    }
}

//The curl function
function file_get_curl($url){
if(!function_exists('curl_init')) die('cURL must be installed');
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";

curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0 Firefox/5.0');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_HEADER, true);
curl_setopt($curl, CURLOPT_REFERER, $url);
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_TIMEOUT, 5);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

$html = curl_exec($curl);

$status = curl_getinfo($curl);
curl_close($curl);

if($status['http_code'] != 200){
    if($status['http_code'] == 301 || $status['http_code'] == 302) {
        list($header) = explode("\r\n\r\n", $html, 2);
        $matches = array();
        //Bail out if the redirect response carries no Location/URI header
        if(!preg_match("/(Location:|URI:)[^\r\n]*/i", $header, $matches)) return FALSE;
        $url = trim(str_replace($matches[1],"",$matches[0]));
        $url_parsed = parse_url($url);
        return ($url_parsed !== false)? file_get_curl($url):'';
    }
    return FALSE;
}else{
    return $html;
}
}
?>
<form method="POST" action="">
<p><input type="text" name="find" size="20"><input type="submit" value="Submit"></p>
</form>
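
The two steps this answer describes (find the IMDb ID in the search results, then pull the poster URL from the title page) can be tried offline on sample markup. The following is a minimal sketch, assuming title IDs have the form `tt` followed by digits and that the poster is the first `.jpg` found in an `img` tag's `src` attribute. The helper names and the sample HTML are hypothetical, and IMDb's real markup may differ.

```php
<?php
// Hypothetical helper: pull the first IMDb title ID out of search-result HTML.
function extract_imdb_id(string $html): ?string
{
    // Assumes IDs look like "tt" + digits inside a /title/<id>/ link.
    if (preg_match('#/title/(tt\d+)/#', $html, $m)) {
        return $m[1];
    }
    return null;
}

// Hypothetical helper: return the first .jpg image URL, extension included.
function extract_first_jpg(string $html): ?string
{
    // [^"] keeps the match inside a single src attribute value.
    if (preg_match('#<img[^>]*\ssrc="([^"]+?\.jpg)"#i', $html, $m)) {
        return $m[1];
    }
    return null;
}

// Sample markup invented for illustration; not real IMDb output.
$search = '<a href="/title/tt0000001/mediaindex">21 Jump Street</a>';
$page   = '<img alt="Poster" src="http://example.com/poster.jpg" />';

echo extract_imdb_id($search), "\n";
echo extract_first_jpg($page), "\n";
```

Restricting the ID pattern to `tt\d+` avoids capturing unrelated path segments, and capturing the extension means the result can be used directly as an image URL.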