This is the array I have:
var_dump($arr);
// prints the array below
[0] => Array
(
[title] => Lee Daniels' The Butler (2013)
)
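I want to remove the year in parentheses, replace the spaces (" ") with underscores ("_"), and then urlencode the result. The expected output is therefore: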
Lee_Daniels%27_The_Butler
Here is my code:
$url = preg_replace('/\((\d){4}\)/', '', $arr[0]['title']);
$title = str_replace(" ","_", trim($url));
$title = urlencode($title); // tried with urlencode(addslashes($title));
echo $title; // prints Lee_Daniels'_The_Butler
I know that echo urlencode('\'') gives "%27", so I tried using addslashes, but it did not help.
Update: it works with
preg_replace('/\((\d){4}\)/', '', "Lee Daniels' The Butler (2013)");
but it does not work if the string is fetched directly, as follows:
include_once('simple_html_dom.php');
$url = 'http://www.imdb.com/chart/';
$main_content = file_get_html($url);
$table = $main_content->find('table', 0);
$tbody = $table->find('tbody', 0);
$trs = $tbody->find('tr');
foreach ($trs as $tr) {
$tds = $tr->find('td');
$movies = array(); // start from an empty array; assigning a string key to "" fails on PHP 7.1+
$movies['title'] = trim($tds[2]->plaintext);
$arr[] = $movies;
}
$url = preg_replace('/\((\d){4}\)/', '', $arr[0]['title']);
$title = str_replace(" ","_", trim($url));
$title = urlencode($title);
echo $title;
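To reproduce this, include the Simple HTML DOM parser in your PHP script.
Can someone point out what I am missing?
Answer 0: (score: 0)
Here is the working code. The key change is decoding HTML entities with html_entity_decode before stripping the year and URL-encoding: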
include_once('simple_html_dom.php');
$url = 'http://www.imdb.com/chart/';
$main_content = file_get_html($url);
$table = $main_content->find('table', 0);
$tbody = $table->find('tbody', 0);
$trs = $tbody->find('tr');
foreach ($trs as $tr) {
$tds = $tr->find('td');
$movies = array(); // start from an empty array; assigning a string key to "" fails on PHP 7.1+
$movies['title'] = trim($tds[2]->plaintext);
$arr[] = $movies;
}
$title = html_entity_decode($arr[0]['title'], ENT_QUOTES, 'UTF-8');
$title = trim(preg_replace('/\((\d){4}\)/', '', $title));
$title = str_replace(" ", "_", $title);
$title = urlencode($title);
echo $title;
Note that screen scraping violates the IMDb conditions mentioned here. This is for learning purposes only.
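The effect of the decoding step can be checked without fetching any page. The sketch below assumes that the scraped plaintext carries the apostrophe as the HTML entity &#39; rather than as a literal character. The sample string and the helper name title_to_slug are illustrative and do not come from the original post.

```php
<?php
// Hypothetical helper (not from the original post): decode entities,
// drop the year, swap spaces for underscores, then URL-encode.
function title_to_slug(string $raw): string
{
    // Turn entities such as &#39; back into literal characters.
    $title = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');
    // Remove a parenthesised four-digit year, e.g. "(2013)".
    $title = trim(preg_replace('/\(\d{4}\)/', '', $title));
    $title = str_replace(' ', '_', $title);
    return urlencode($title);
}

// Assumed form of the scraped text: apostrophe encoded as an entity.
$scraped = 'Lee Daniels&#39; The Butler (2013)';

// With decoding: the apostrophe becomes %27.
echo title_to_slug($scraped), "\n";
// prints Lee_Daniels%27_The_Butler

// Without decoding: the entity itself gets URL-encoded.
$undecoded = trim(preg_replace('/\(\d{4}\)/', '', $scraped));
echo urlencode(str_replace(' ', '_', $undecoded)), "\n";
// prints Lee_Daniels%26%2339%3B_The_Butler
```

If the second form is what shows up in the generated URLs, the input most likely contains entities. A browser renders &#39; as a plain apostrophe, which is why the echoed string can look correct on screen while urlencode sees something else.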