Question

I'm trying to parse two numbers out of a URL. The URL is:

http://movies.actionpaxed.com/5600_5949/5943/5/pics/none/500k/3min/003.jpg?nvb=20130811232301&nva=20130812012301&hash=090a687f7e27b2f5ef735

I only want the "5943/5" part of the URL. I could just parse the URL and then use str_replace, but the two folder names I need are different every time.

So far I have:

$homepage = file_get_contents($url);
$link = parse_to_string('"video_url":"', '"};', $homepage);
$link = str_replace(array( '"low":"', '"};'), '', $link);
$link = utf8_decode(urldecode($link));

At the end of this code, $link = http://movies.actionpaxed.com/5600_5949/5943/5/pics/none/500k/3min/003.jpg?nvb=20130811232301&nva=20130812012301&hash=090a687f7e27b2f5ef735

Any help with a regular expression that would handle this for me would be greatly appreciated!

Answer 1

How about:

$res = explode('/', parse_url($url, PHP_URL_PATH));
$res = $res[2].'/'.$res[3];
echo $res;


Answer 2

$exploded = explode("/", $link);
$res = $exploded[4] . "/" . $exploded[5];

echo $res;

Answer 3

preg_match('%https?://.*?/\d*_\d*/(\d*)/(\d*)%',$link,$matches);
print_r($matches);
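
The print_r call above only dumps the match array; the two captured groups still have to be joined to get "5943/5". A minimal sketch of that step follows. It assumes a slightly stricter variant of the pattern (anchored at the start, with \d+ instead of \d* so that empty folder names do not match), which is an adjustment for illustration and not part of the original answer.

```php
<?php
$link = 'http://movies.actionpaxed.com/5600_5949/5943/5/pics/none/500k/3min/003.jpg?nvb=20130811232301&nva=20130812012301&hash=090a687f7e27b2f5ef735';

// preg_match returns 1 on a match, 0 on no match, and false on error.
// Assumed variant of the pattern above: anchored, and \d+ instead of \d*.
if (preg_match('%^https?://[^/]+/\d+_\d+/(\d+)/(\d+)/%', $link, $matches) === 1) {
    // $matches[1] and $matches[2] hold the two captured folder names
    $res = $matches[1] . '/' . $matches[2];
} else {
    $res = false;
}

echo $res === false ? 'no match' : $res, "\n"; // prints "5943/5"
```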

Answer 4

Here is a function that extracts what you are looking for.

function getTheStuff($url) {

    // Only get the part of the URL that
    // actually matters; this makes the
    // problem smaller and easier to solve
    $path = parse_url($url, PHP_URL_PATH);

    // The path will be false if the URL is
    // malformed, or null if it was not found
    if ($path !== false && $path !== null) {

        // Assuming that the stuff you need is
        // always after the first forward slash,
        // and that the format never changes,
        // it should be easy to match
        preg_match('/^\/[\d_]+\/(\d+\/\d+)/', $path, $result);

        // We only capture one thing so what we
        // are looking for can only be the second
        // thing in the array
        if (isset($result[1])) {
            return $result[1];
        }
    }
    // If it is not in the array then it
    // means that it was not found
    return false;
}
$url = 'http://movies.actionpaxed.com/5600_5949/5943/5/pics/none/500k/3min/003.jpg?nvb=20130811232301&nva=20130812012301&hash=090a687f7e27b2f5ef735';
var_dump(getTheStuff($url));

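If I were writing this for myself, I would avoid the regular expression. It is the simplest option in this case, so I used it. I would probably generalize the solution by tokenizing $path (using / as the delimiter) and then letting another function/method/mechanism handle extracting the parts that are needed. That would make it easier to reuse for other URLs with a different format.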
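
The tokenizing approach described above could look roughly like the sketch below. The function names pathSegments and pickSegments are hypothetical, chosen only for illustration, and the offset and length passed in the usage line assume the URL layout from the question.

```php
<?php
// Hypothetical helper: split the path of a URL into its non-empty segments.
function pathSegments($url) {
    $path = parse_url($url, PHP_URL_PATH);
    // parse_url gives false for a malformed URL and null when there is no path
    if (!is_string($path)) {
        return array();
    }
    $parts = array_filter(explode('/', $path), function ($segment) {
        return $segment !== '';
    });
    // array_filter keeps the old keys, so re-index from zero
    return array_values($parts);
}

// Hypothetical helper: join $length segments starting at $offset,
// or return false when the path is too short.
function pickSegments($url, $offset, $length) {
    $parts = array_slice(pathSegments($url), $offset, $length);
    return count($parts) === $length ? implode('/', $parts) : false;
}

$url = 'http://movies.actionpaxed.com/5600_5949/5943/5/pics/none/500k/3min/003.jpg?nvb=20130811232301&nva=20130812012301&hash=090a687f7e27b2f5ef735';
echo pickSegments($url, 1, 2), "\n"; // prints "5943/5"
```

Keeping the tokenizing step separate from the selection step means a URL with a different layout only needs a different offset and length, not a new pattern.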

Parsing between the slashes in a URL
