Parsing a URL in .NET

Date: 2013-11-24 01:21:19

Tags: c# .net url uri

I'm looking for a .NET Framework class that can parse URLs.

Some examples of URLs that need to be parsed:

  • server:8088
  • server:8088/func1
  • server:8088/func1/SubFunc1
  • http://server
  • http://server/func1
  • http://server/func/SubFunc1
  • http://server:8088
  • http://server:8088/func1
  • http://server:8088/func1/SubFunc1
  • magnet://server
  • magnet://server/func1
  • magnet://server/func/SubFunc1
  • magnet://server:8088
  • magnet://server:8088/func1
  • magnet://server:8088/func1/SubFunc1

The problem is that the Uri and UriBuilder classes don't handle these URLs correctly. For example, they are confused by:

stackoverflow.com:8088

URL Background

A URL has the following format:

  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \_________/ \__/\_________/\__________/ \__/
   |         |        |     |           |        |
scheme      host    port   path       query   fragment

In our case, we only care about:

  • Uri.Scheme
  • Uri.Host
  • Uri.Port
  • Uri.Path
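
As a rough illustration of how these four pieces map onto `System.Uri`, here is a minimal sketch using the diagram's URL. The class and method names (`UriParts`, `Split`) are made up for this example. Note that `Uri` itself has no property called `Path`; the sketch assumes `Uri.AbsolutePath` is what is meant (`UriBuilder` does expose a `Path` property).

```csharp
using System;

static class UriParts
{
    // Hypothetical helper: returns the four components the question cares about.
    // Only works for inputs that Uri accepts as absolute URIs.
    public static (string Scheme, string Host, int Port, string Path) Split(string text)
    {
        var uri = new Uri(text);
        // AbsolutePath excludes the query ("?name=ferret") and fragment ("#nose").
        return (uri.Scheme, uri.Host, uri.Port, uri.AbsolutePath);
    }

    static void Main()
    {
        var parts = Split("foo://example.com:8042/over/there?name=ferret#nose");
        Console.WriteLine("Scheme: {0}", parts.Scheme);
        Console.WriteLine("Host:   {0}", parts.Host);
        Console.WriteLine("Port:   {0}", parts.Port);
        Console.WriteLine("Path:   {0}", parts.Path);
    }
}
```

This works for a fully qualified input like the one in the diagram; the scheme-less inputs listed earlier are where the trouble starts.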

Tests

Running some tests, we can check how the UriBuilder class handles various URIs:

//                                       Expected values
//Test URI                               Scheme    Server    Port  Path
//=====================================  ========  ========  ====  ====================
t("server",                              "",       "server", -1,   "");
t("server/func1",                        "",       "server", -1,   "/func1");
t("server/func1/SubFunc1",               "",       "server", -1,   "/func1/SubFunc1");
t("server:8088",                         "",       "server", 8088, "");
t("server:8088/func1",                   "",       "server", 8088, "/func1");
t("server:8088/func1/SubFunc1",          "",       "server", 8088, "/func1/SubFunc1");
t("http://server",                       "http",   "server", -1,   "");
t("http://server/func1",                 "http",   "server", -1,   "/func1");
t("http://server/func/SubFunc1",         "http",   "server", -1,   "/func/SubFunc1");
t("http://server:8088",                  "http",   "server", 8088, "");
t("http://server:8088/func1",            "http",   "server", 8088, "/func1");
t("http://server:8088/func1/SubFunc1",   "http",   "server", 8088, "/func1/SubFunc1");
t("magnet://server",                     "magnet", "server", -1,   "");
t("magnet://server/func1",               "magnet", "server", -1,   "/func1");
t("magnet://server/func/SubFunc1",       "magnet", "server", -1,   "/func/SubFunc1");
t("magnet://server:8088",                "magnet", "server", 8088, "");
t("magnet://server:8088/func1",          "magnet", "server", 8088, "/func1");
t("magnet://server:8088/func1/SubFunc1", "magnet", "server", 8088, "/func1/SubFunc1");
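
The question does not show the `t` helper itself. A minimal sketch of what such a helper might look like, assuming it simply feeds the input to `UriBuilder` and compares the four properties, is below. The class name and output format are invented for illustration.

```csharp
using System;

static class UriBuilderTests
{
    // Hypothetical test helper: parses `url` with UriBuilder and compares
    // the result against the expected scheme, host, port and path.
    public static bool t(string url, string scheme, string host, int port, string path)
    {
        var b = new UriBuilder(url);
        bool ok = b.Scheme == scheme && b.Host == host
               && b.Port == port && b.Path == path;

        // Print what UriBuilder actually reported, so mismatches are visible.
        Console.WriteLine("{0,-4} {1,-36} -> {2}|{3}|{4}|{5}",
            ok ? "PASS" : "FAIL", url, b.Scheme, b.Host, b.Port, b.Path);
        return ok;
    }

    static void Main()
    {
        // A few rows from the table above; paste in the rest as needed.
        t("server:8088",              "",       "server", 8088, "");
        t("http://server",            "http",   "server", -1,   "");
        t("http://server:8088/func1", "http",   "server", 8088, "/func1");
    }
}
```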

All but six of the cases fail to parse correctly:

Url                                  Scheme  Host    Port  Path
===================================  ======  ======  ====  ===============
server                               http    server  80    /
server/func1                         http    server  80    /func1
server/func1/SubFunc1                http    server  80    /func1/SubFunc1
server:8088                          server          -1    8088
server:8088/func1                    server          -1    8088/func1
server:8088/func1/SubFunc1           server          -1    8088/func1/SubFunc1
http://server                        http    server  80    /
http://server/func1                  http    server  80    /func1
http://server/func/SubFunc1          http    server  80    /func/SubFunc1
http://server:8088                   http    server  8088  /
http://server:8088/func1             http    server  8088  /func1
http://server:8088/func1/SubFunc1    http    server  8088  /func1/SubFunc1
magnet://server                      magnet  server  -1    /
magnet://server/func1                magnet  server  -1    /func1
magnet://server/func/SubFunc1        magnet  server  -1    /func/SubFunc1
magnet://server:8088                 magnet  server  8088  /
magnet://server:8088/func1           magnet  server  8088  /func1
magnet://server:8088/func1/SubFunc1  magnet  server  8088  /func1/SubFunc1

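I did say I wanted a .NET Framework class. I would also accept any code-gum lying around that I can pick up and chew, as long as it satisfies my simple test cases.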

Bonus Chatter

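I was looking at expanding this question, but that question is limited to http only.

I also asked this same question earlier today, but I now realize I phrased it incorrectly. I mistakenly asked how to "build" a URL, when what I actually want is to "parse" a URL entered by the user. I can't go back and fundamentally change the title now, so I'm asking the same question again here, only better, with a clearer goal.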

1 answer:

Answer 0: (score: 1)

Would this regular expression do?

^((?<schema>[a-z]*)://)?(?<host>[^/:]*)?(:(?<port>[0-9]*))?(?<path>/.*)?$

It's not perfect, but it seems to work for your test cases.
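
One way the pattern above could be wired into C# is sketched below. The pattern is the one from the answer, unchanged, including its `schema` group name. Everything else is an assumption made for illustration: the `LooseUrlParser` class, the `TryParse` signature, and the conventions of returning `""` for a missing scheme or path and `-1` for a missing port, which follow the expected values in the question's test table.

```csharp
using System;
using System.Text.RegularExpressions;

static class LooseUrlParser
{
    // Pattern taken verbatim from the answer.
    static readonly Regex Pattern = new Regex(
        @"^((?<schema>[a-z]*)://)?(?<host>[^/:]*)?(:(?<port>[0-9]*))?(?<path>/.*)?$");

    // Hypothetical wrapper: missing parts come back as "" (or -1 for the port).
    public static bool TryParse(string input, out string scheme, out string host,
                                out int port, out string path)
    {
        scheme = host = path = "";
        port = -1;

        Match m = Pattern.Match(input);
        if (!m.Success) return false;

        // Groups that did not participate in the match have an empty Value.
        scheme = m.Groups["schema"].Value;
        host = m.Groups["host"].Value;
        path = m.Groups["path"].Value;

        // The port group may be absent, or present but empty ("server:").
        string digits = m.Groups["port"].Value;
        if (digits.Length > 0 && !int.TryParse(digits, out port))
        {
            port = -1;      // digits that overflow an int
            return false;
        }
        return true;
    }

    static void Main()
    {
        foreach (var url in new[] { "server:8088", "http://server/func1",
                                    "magnet://server:8088/func1/SubFunc1" })
        {
            if (TryParse(url, out var s, out var h, out var p, out var path))
                Console.WriteLine("{0,-36} '{1}' '{2}' {3} '{4}'", url, s, h, p, path);
        }
    }
}
```

As the answer says, the pattern is not perfect. For instance, `[a-z]*` only matches lowercase scheme names, so an input such as `HTTP://server` would not be split as intended unless the pattern is adjusted (for example with `RegexOptions.IgnoreCase`).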