Question

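I'm writing a small function here and could use some help optimising it!

All requests are redirected to the index page.

I have this function that parses a URL into an array.

The URLs have the following form:

http://localhost/{user}/{page}/?sub_page={sub_page}&action={action}

So an example would be:

http://localhost/admin/stock/?sub_page=products&action=add

The domain is not part of the request URI, so my function receives a string like this:

/admin/stock/?sub_page=products&action=add

My function is below. Warning: it is very procedural.

For anyone who can't be bothered to read and understand it, I've added an explanation at the bottom ;)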

function uri_to_array($uri){
    // uri will be in format: /{user}/{page}/?sub_page={subpage}&action={action} ... && plus additional parameters

    // define array that will be returned
    $return_uri_array = array();

    // separate path from querystring;
    $array_tmp_uri = explode("?", $uri);

    // if explode returns the same as input $string, no delimiter was found
    if ($uri == $array_tmp_uri[0]){ 

        // no question mark found.
        // format either '/{user}/{page}/' or '/{user}/'
        $uri = trim($array_tmp_uri[0], "/");

        // remove excess baggage
        unset ($array_tmp_uri);

        // format either '{user}/{page}' or '{user}'
        $array_uri = explode("/", $uri);

        // if explode returns the same as input $string, no delimiter was found
        if ($uri == $array_uri[0]){
            // no {page} defined, just user.
            $return_uri_array["user"] = $array_uri[0];
        }
        else{
            // {user} and {page} defined.
            $return_uri_array["user"] = $array_uri[0];
            $return_uri_array["page"] = $array_uri[1];            
        }
    }
    else{

        // query string is defined
        // format either '/{user}/{page}/' or '/{user}/'
        $uri = trim($array_tmp_uri[0], "/");
        $parameters = trim($array_tmp_uri[1]);

        // PARSE PATH
        // remove excess baggage
        unset ($array_tmp_uri);

        // format either '{user}/{page}' or '{user}'
        $array_uri = explode("/", $uri);

        // if explode returns the same as input $string, no delimiter was found
        if ($uri == $array_uri[0]){
            // no {page} defined, just user.
            $return_uri_array["user"] = $array_uri[0];
        }
        else{
            // {user} and {page} defined.
            $return_uri_array["user"] = $array_uri[0];
            $return_uri_array["page"] = $array_uri[1];            
        }

        // parse parameter string
        $parameter_array = array();
        parse_str($parameters, $parameter_array);

        // copy parameter array into return array
        foreach ($parameter_array as $key => $value){
            $return_uri_array[$key] = $value;
        }
    }
    return $return_uri_array;
}

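Basically there is one main if statement: one branch handles the case where no query string is defined (no '?'), and the other handles the case where a '?' is present.

I'm just looking to make this function better.

Would it be worth turning it into a class?

Basically I need a function that takes /{user}/{page}/?sub_page={sub_page}&action={action} as its parameter and returns: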

array(
    "user" => {user},
    "page" => {page},
    "sub_page" => {sub_page},
    "action" => {action}
)

Cheers, Alex

Answer 1

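A few suggestions for making this function better.

First, use parse_url instead of explode to separate the host name, path, and query string.

Second, move the path-parsing code in front of the check for a query string, since you parse the path either way.

Third, instead of copying the parameters with a foreach loop, use array_merge, like this: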

// put $return_uri_array last so $parameter_array can't override values
$return_uri_array = array_merge($parameter_array, $return_uri_array);
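
The three suggestions above could be combined roughly as follows. This is only a sketch: it keeps the function name from the question, assumes a PHP version in which parse_url() accepts a string that starts at the path (as the domain-less request URI does), and returning an empty array for a malformed URI is an assumption, not something the question specifies.

```php
<?php
// Sketch combining parse_url(), a single path-parsing step, and array_merge().
function uri_to_array($uri)
{
    // parse_url() splits the string into components such as 'path' and 'query'.
    $parts = parse_url($uri);
    if ($parts === false) {
        // Assumption: a seriously malformed URI yields an empty result.
        return array();
    }

    // Parse the path first; it is needed whether or not a query string exists.
    $path = isset($parts['path']) ? trim($parts['path'], '/') : '';
    $segments = ($path === '') ? array() : explode('/', $path);

    $result = array();
    if (isset($segments[0])) {
        $result['user'] = $segments[0];
    }
    if (isset($segments[1])) {
        $result['page'] = $segments[1];
    }

    // Parse the query string, if there is one.
    $parameters = array();
    if (isset($parts['query'])) {
        parse_str($parts['query'], $parameters);
    }

    // $result goes last so query parameters cannot override 'user' or 'page'.
    return array_merge($parameters, $result);
}

print_r(uri_to_array('/admin/stock/?sub_page=products&action=add'));
```

With this ordering, a request such as /admin/stock/?user=evil still reports admin as the user, because the path-derived values are merged last.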

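Whether this should be a class depends on your programming style. As a general rule I always use classes, because they are easier to mock in unit tests.

The most compact way is a regular expression like this (not fully tested, just to show the principle):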

if(preg_match('!http://localhost/(?P<user>\w+)(?:/(?P<page>\w+))/(?:\?sub_page=(?P<sub_page>\w+)&action=(?P<action>\w+))!', $uri, $matches)) {
  return $matches;
}

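The resulting array will also contain numeric indexes for the matches, but you can ignore them or filter out the keys you want with array_intersect_key. The \w+ pattern matches all "word" characters; you can replace it with [-a-zA-Z0-9_] or a similar character class.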
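
As an illustration of the filtering step, here is a minimal sketch. The pattern is a hypothetical variant written for the domain-less string from the question, and the variable names are assumptions made for the example.

```php
<?php
// Hypothetical pattern for the domain-less form '/{user}/{page}/?sub_page=..&action=..'.
$pattern = '!^/(?P<user>\w+)/(?P<page>\w+)/\?sub_page=(?P<sub_page>\w+)&action=(?P<action>\w+)$!';
$uri = '/admin/stock/?sub_page=products&action=add';

$result = array();
if (preg_match($pattern, $uri, $matches)) {
    // $matches holds both named and numeric entries for every group.
    // array_flip() turns the list of wanted names into array keys, and
    // array_intersect_key() keeps only the entries of $matches with those keys.
    $wanted = array_flip(array('user', 'page', 'sub_page', 'action'));
    $result = array_intersect_key($matches, $wanted);
}

print_r($result);
```

Only the four named entries remain in $result; the numeric duplicates are dropped.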

Answer 2

Maybe this?

function uri_to_array($uri){
  $result = array();

  parse_str(substr($uri, strpos($uri, '?') + 1), $result);
  list($result['user'], $result['page']) = explode('/', trim($uri, '/'));

  return $result;
}

print_r(
  uri_to_array('/admin/stock/?sub_page=products&action=add')
);

/*
Array
(
    [sub_page] => products
    [action] => add
    [page] => stock
    [user] => admin
)
*/

Demo: http://codepad.org/nBCj38zT
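
Note that the function above relies on the input containing a '?' and two path segments. Without a '?', strpos() returns false, false + 1 evaluates to 1, and most of the path ends up in parse_str(). Also, on PHP 7 and later list() assigns left to right, so the key order may differ from the output shown. A slightly more defensive sketch, under a hypothetical name, might look like this:

```php
<?php
// Hypothetical hardened variant; the name uri_to_array_safe is not from the answer.
function uri_to_array_safe($uri)
{
    $result = array();
    $path = $uri;

    // Only treat the string as having a query part when '?' is actually present.
    $pos = strpos($uri, '?');
    if ($pos !== false) {
        parse_str(substr($uri, $pos + 1), $result);
        $path = substr($uri, 0, $pos);
    }

    // array_pad() guarantees two elements, so list() never reads a missing index.
    $segments = array_pad(explode('/', trim($path, '/')), 2, null);
    list($result['user'], $result['page']) = $segments;

    return $result;
}

print_r(uri_to_array_safe('/admin/'));
```

Here a missing page is reported as null rather than triggering a notice; whether null or an absent key is preferable depends on how the caller uses the array.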

Answer 3

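If you want to

- do it well
- use regular expressions
- use the same method to parse all URLs (parse_url() does not support relative paths, called only_path below)

then this might suit your taste: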

$url = 'http://localhost/admin/stock/?sub_page=products&action=add';
preg_match ("!^((?P<scheme>[a-zA-Z][a-zA-Z\d+-.]*):)?(((//(((?P<credentials>([a-zA-Z\d\-._~\!$&'()*+,;=%]*)(:([a-zA-Z\d\-._~\!$&'()*+,;=:%]*))?)@)?(?P<host>([\w\d\-.%]+)|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(\[([a-fA-F\d.:]+)\]))?(:(?P<port>\d*))?))(?<path>(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))|(?P<only_path>(/(([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))?)|([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*)))?(?P<query>\?([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*))?(?P<fragment>#([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*))?$!u", $url, $matches);
$parts = array_intersect_key ($matches, array ('scheme' => '', 'credentials' => '', 'host' => '', 'port' => '', 'path' => '', 'query' => '', 'fragment' => '', 'only_path' => '', ));
var_dump ($parts);

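It should cover all possible well-formed URLs.

If host is empty, only_path should contain the path, i.e. a URL with neither protocol nor host.

Update

Maybe I should have read the question more carefully. This parses a URL into components that make it easier to get at the parts you are really interested in. Run something like: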

// split the URL
preg_match ("!^((?P<scheme>[a-zA-Z][a-zA-Z\d+-.]*):)?(((//(((?P<credentials>([a-zA-Z\d\-._~\!$&'()*+,;=%]*)(:([a-zA-Z\d\-._~\!$&'()*+,;=:%]*))?)@)?(?P<host>([\w\d\-.%]+)|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(\[([a-fA-F\d.:]+)\]))?(:(?P<port>\d*))?))(?<path>(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))|(?P<only_path>(/(([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))?)|([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*)))?(\?(?P<query>([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*)))?(#(?P<fragment>([a-zA-Z\d\-._~\!$&'()*+,;=:@%/?]*)))?$!u", $url, $matches);
$parts = array_intersect_key ($matches, array ('scheme' => '', 'credentials' => '', 'host' => '', 'port' => '', 'path' => '', 'query' => '', 'fragment' => '', 'only_path' => '', ));

// extract the user and page
preg_match ('!/*(?P<user>.*)/(?P<page>.*)/!u', $parts['path'], $matches);
$user_and_page = array_intersect_key ($matches, array ('user' => '', 'page' => '', ));

// the query string stuff
$query = array ();
parse_str ($parts['query'], $query);

References:

To clarify, these are the relevant documents used to formulate the regular expression:

RFC3986 scheme / protocol
RFC3986 user and password
RFC1035 host name
- or RFC3986 IPv4
- or RFC2732 IPv6
RFC3986 query
RFC3986 fragment

Parse a URL string (path and parameters) into an array
