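How do I preg_match_all strings that start with "http" and end with (") or (') or whitespace (tab, space, line break)?

Time: 2011-07-11 12:36:54

Tags: php preg-match-all

How do I write a regular expression for preg_match_all that matches text starting with "http" (without the quotes) and ending with (") or (') or whitespace (tab, space, line break)?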

I want to preg_match_all every part that starts with "http" in the following text:

Wupload
http://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar
http://www.wupload.com/file/VVVVVVVV/NNIW-LiBRARY.part2.rar
http://www.wupload.com/file/TTTTTTT/NNIW-LiBRARY.part3.rar

Fileserve
http://www.fileserve.com/file/WWWW/NNIW-LiBRARY.part1.rar
http://www.fileserve.com/file/TTTTT/NNIW-LiBRARY.part2.rar
http://www.fileserve.com/file/RRRRR/NNIW-LiBRARY.part3.rar

Uploaded.To
http://ul.to/AAAA/NNIW-LiBRARY.part1.rar
http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar
http://ul.to/YYYYYY/NNIW-LiBRARY.part3.rar

The result must look like this:
http://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar
http://www.wupload.com/file/VVVVVVVV/NNIW-LiBRARY.part2.rar
http://www.wupload.com/file/TTTTTTT/NNIW-LiBRARY.part3.rar
http://www.fileserve.com/file/WWWW/NNIW-LiBRARY.part1.rar
http://www.fileserve.com/file/TTTTT/NNIW-LiBRARY.part2.rar
http://www.fileserve.com/file/RRRRR/NNIW-LiBRARY.part3.rar
http://ul.to/AAAA/NNIW-LiBRARY.part1.rar
http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar
http://ul.to/YYYYYY/NNIW-LiBRARY.part3.rar

3 Answers:

Answer 0 (score: 2)

I suggest using parse_url to get the parts of the URL! Take a look at php.net.

Edit:

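// $filename is a placeholder for the path to your file.
$file = file_get_contents($filename);
// Split on any line ending (\r\n, \n or \r), not only \r\n.
$lines = preg_split('/\R/', $file);
foreach ($lines as $line) {
    $urlParts = parse_url(trim($line));
    // parse_url() returns false for malformed input and omits 'scheme' for plain text.
    if (is_array($urlParts) && isset($urlParts['scheme']) && $urlParts['scheme'] === 'http') {
        // Do anything ...
    }
}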

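Update:

OK, I did not know what your code looked like! If you want to crawl HTML to find links, I suggest the following, which captures the href value of each <a> tag: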

preg_match_all ( "/<[ ]{0,}a[ \n\r][^<>]{0,}(?<= |\n|\r)(?:href)[ \n\r]{0,}=[ \n\r]{0,}[\"|']{0,1}([^\"'>< ]{0,})[^<>]{0,}>((?:(?!<[ \n\r]*\/a[ \n\r]*>).)*)<[ \n\r]*\/a[ \n\r]*>/ is", $source, $regs );

// Collect every captured href value instead of overwriting a single array key.
$links = array ();
foreach ( $regs [ 1 ] as $href ) {
    $links [] = trim ( $href );
}

Then use parse_url to check those.
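
As a rough sketch of that last step, assuming the href values have already been collected into an array (the `$candidates` contents and the `$httpLinks` name are made up for illustration), the scheme check could look something like this:

```php
<?php
// Hypothetical candidates, e.g. values captured from href attributes.
$candidates = [
    'http://www.fileserve.com/file/WWWW/NNIW-LiBRARY.part1.rar',
    'ftp://example.com/file.rar',
    'Fileserve',
];

$httpLinks = [];
foreach ($candidates as $candidate) {
    // With PHP_URL_SCHEME, parse_url() returns the scheme as a string,
    // null when the input has no scheme, or false on malformed input.
    $scheme = parse_url($candidate, PHP_URL_SCHEME);
    if (is_string($scheme) && strtolower($scheme) === 'http') {
        $httpLinks[] = $candidate;
    }
}

print_r($httpLinks);
```

Asking parse_url for a single component avoids indexing into an array that may not contain a 'scheme' key.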

Answer 1 (score: 0)

Are you saying you want to strip the "Wupload", "Fileserve" and "Uploaded.To" headings and capture only the URLs in an array? If so, try the following:

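// Match each line that starts with http:// and stop at the first whitespace or quote,
// so the trailing newline is not captured and a final line without one still matches.
preg_match_all('!^http://[^\s"\']+!m', $string, $matches);
echo "<pre>" . print_r($matches, true) . "</pre>";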

Answer 2 (score: 0)

This should do what you need:

<?php
$matches = array();
preg_match_all('@https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?@', $string, $matches);
foreach ($matches[0] as $match) {
    // Do your processing here.
}
?>
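
For comparison, the rule stated in the question (start at "http", stop at a double quote, a single quote, or any whitespace) can also be written directly as a negated character class. This is only a minimal sketch; the `$text` sample is invented for illustration and mixes bare lines with quoted attributes:

```php
<?php
// Hypothetical sample: one bare URL on its own line and two quoted URLs.
$text = "Wupload\nhttp://www.wupload.com/file/CCCCCCC/NNIW-LiBRARY.part1.rar\n"
      . "<a href=\"http://ul.to/AAAA/NNIW-LiBRARY.part1.rar\">part 1</a> "
      . "<a href='http://ul.to/BBBBB/NNIW-LiBRARY.part2.rar'>part 2</a>";

// "http", then every character up to the first whitespace, " or '.
preg_match_all('~http[^\s"\']+~', $text, $matches);
$urls = $matches[0];

print_r($urls);
```

Because the pattern starts with plain "http", it also picks up https links. It does not validate the URLs, so passing the results through parse_url is still worthwhile if that matters.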