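Regular expression to find a specific part of an HTML page

Date: 2014-01-09 16:03:36

Tags: php regex

I need a regular expression that picks the following line out of a larger block of code. The part I want to find: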

->Copy frame link\",\"url240\":\"http:\/\/cs534515v4.vk.me\/u163220668\/videos\/1c1b06aec9.240.mp4\",\"url360\":\"http:\/\/cs534515v4.vk.me\/u163220668\/videos\/1c1b06aec9.360.mp4\",\"jpg\"<-

This text is one part of an HTML page, and I want to retrieve only the part shown above. I am writing the code in PHP.

My complete code:

<?php

set_time_limit(0);
function get_content_of_url($url){
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
    $content = curl_exec($ch);
    curl_close($ch);
    return $content;
}

$plyst  =   get_content_of_url("http://vk.com/video56612186_167113956");
preg_match('/link\\".*"jpg\\"/', $plyst , $matches);        


var_dump($matches);
//preg_match('/http:\/\/[a-zA-Z0-9\\/-_.]+/', $matches[0][0], $id);
//start_script($id[0]);

?>

1 answer:

Answer 0 (score: 0)

How about this?

$str = "video_get_current_url\":\"Copy frame link\",\"url240\":\"http:\\\/\\\/cs534515v4.vk.me\\\/u163220668\\\/videos\\\/1c1b06aec9.240.mp4\",\"url360\":\"http:\\\/\\\/cs534515v4.vk.me\\\/u163220668\\\/videos\\\/1c1b06aec9.360.mp4\",\"jpg\":\"http:\\\/\\\/cs534515.vk.me\\\/u163220668\\\/video\\\/l_8a5b0712.jpg\",\"ip_subm\":1,\"nologo";

preg_match('/\\"Copy\sframe.*"jpg\\"/is', $str, $matches);

var_dump($matches);

Output:

array(1) {
  [0]=>
  string(187) ""Copy frame link","url240":"http:\\/\\/cs534515v4.vk.me\\/u163220668\\/videos\\/1c1b06aec9.240.mp4","url360":"http:\\/\\/cs534515v4.vk.me\\/u163220668\\/videos\\/1c1b06aec9.360.mp4","jpg""
}
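A note on why the pattern in the question may find nothing: inside a single-quoted PHP string, `\\` collapses to a single backslash, so the regex engine receives `\"`, which matches only a quote character and not the backslash in front of it. The sketch below assumes the fetched page really contains literal `\"` sequences, as the excerpt in the question suggests. `$page` is a shortened, made-up sample (the `example.invalid` host is a placeholder), not actual vk.com output.

```php
<?php
// Hypothetical, shortened sample with literal backslash-quote sequences,
// modelled on the excerpt in the question (not real page output).
$page = 'x\":\"Copy frame link\",\"url240\":\"http:\/\/example.invalid\/a.240.mp4\",\"jpg\":\"y';

// As written in the question: the regex engine sees  link\".*"jpg\"
// and \" matches only a quote, so the backslash after "link" blocks the match.
$asWritten = preg_match('/link\\".*"jpg\\"/', $page, $m1);

// Four backslashes in the PHP literal become \\ in the regex,
// which matches exactly one literal backslash in the subject.
$escaped = preg_match('/link\\\\".*?\\\\"jpg\\\\"/s', $page, $m2);

var_dump($asWritten); // int(0)
var_dump($escaped);   // int(1)
echo $m2[0], "\n";
```

The lazy `.*?` is a deliberate choice here: on a full page a greedy `.*` combined with the `s` flag could run on to the last `"jpg"` in the document rather than the first.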

Edit

Then, if you want to extract the video URLs from the match:

preg_match_all('/(https?:.*?\.mp4)/', $matches[0], $id);

// Then echo out the URLs
foreach ($id[0] as $url) {
    // preg_replace strips every backslash from the escaped URL.
    echo preg_replace('/\\\\/', '', $url)."<br />";
}

Output:

http://cs534515v4.vk.me/u163220668/videos/1c1b06aec9.240.mp4

http://cs534515v4.vk.me/u163220668/videos/1c1b06aec9.360.mp4

Working example: http://sandbox.onlinephpfunctions.com/code/329106d990fe8927a7670b9448770643afbd0865
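As a possible variation, the two steps (find the fragment, then pull out the URLs) can be folded into one pass that also records the quality label. This is only a sketch under a few assumptions: PHP 7 or newer, URLs that end in `.mp4`, and keys of the form `urlNNN`. The function name `extract_video_urls` is made up for illustration, and the sample string uses the escaping shown in the question.

```php
<?php
// Hypothetical helper: returns [quality => plain URL] for every urlNNN entry.
function extract_video_urls(string $fragment): array
{
    $urls = [];
    // \\\\* in the PHP literal becomes \\* in the regex: zero or more
    // literal backslashes, so both "url240":" and \"url240\":\" are accepted.
    $pattern = '/url(\d+)\\\\*":\\\\*"(https?:.*?\.mp4)/';
    if (preg_match_all($pattern, $fragment, $found, PREG_SET_ORDER)) {
        foreach ($found as $hit) {
            // Remove the escaping backslashes to get a usable URL.
            $urls[(int) $hit[1]] = str_replace('\\', '', $hit[2]);
        }
    }
    return $urls;
}

// Sample fragment copied from the question's excerpt.
$sample = '\"url240\":\"http:\/\/cs534515v4.vk.me\/u163220668\/videos\/1c1b06aec9.240.mp4\",'
        . '\"url360\":\"http:\/\/cs534515v4.vk.me\/u163220668\/videos\/1c1b06aec9.360.mp4\",\"jpg\"';

print_r(extract_video_urls($sample));
```

Using `str_replace` instead of `preg_replace` for the cleanup avoids a second layer of regex escaping; both remove every backslash from the captured URL.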