Question

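I have been tracking down a bug in a URL rewriting application. The bug showed up as an encoding problem with some diacritic characters in the query string.

Basically, a request for /search.aspx?search=heřmánek was being rewritten with a query string of "search=he%c5%99m%c3%a1nek".

The correct value (produced by some different, working code) was a rewrite of the query string as "search=he%u0159m%u00e1nek".

Note the difference between the two strings. However, if you decode both, you will see that URL decoding reproduces the same string. It is not until you call the context.Rewrite function that the encoding breaks. The broken string returns 'heÅmÃ¡nek' (using Request.QueryString["Search"]), while the working string returns 'heřmánek'. This change happens after the call to the rewrite function.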
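To make the "both decode to the same string" observation concrete, here is a minimal sketch. It assumes HttpUtility from System.Web is available, and the class and method names are made up for illustration. Both comparisons are expected to print True, since HttpUtility.UrlDecode understands the %uXXXX form as well as plain %xx bytes.

```csharp
using System;
using System.Text;
using System.Web; // HttpUtility

static class DecodeBothForms
{
    // Hypothetical helper: decode a percent-encoded value, always
    // interpreting %xx bytes as UTF-8.
    public static string DecodeUtf8(string encoded) =>
        HttpUtility.UrlDecode(encoded, Encoding.UTF8);

    static void Main()
    {
        // The two query string values from the question.
        string fromUtf8Form = DecodeUtf8("he%c5%99m%c3%a1nek");
        string fromPercentU = DecodeUtf8("he%u0159m%u00e1nek");

        // Compare the decoded values with each other and with the
        // original search term (written with escapes to avoid
        // source-file encoding issues).
        Console.WriteLine(fromUtf8Form == fromPercentU);
        Console.WriteLine(fromUtf8Form == "he\u0159m\u00e1nek");
    }
}
```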

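I traced this down to one set of code using Request.QueryString (working) and the other using Request.Url.Query (Request.Url returns a Uri instance).

While I have worked out the bug, there is still a hole in my understanding, so if anyone knows the difference, I'm ready for the lesson.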

Answer 1

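Your question really sparked my interest, so I've done some reading for the past hour or so. I'm not absolutely positive I've found the answer, but I'll throw it out there to see what you think.

From what I've read so far, Request.QueryString is actually "a parsed version of the QUERY_STRING variable in the ServerVariables collection" [reference], whereas Request.Url is (as you stated) the raw URL encapsulated in a Uri object. According to the documentation for the Uri class, its constructor "...parses the [url string], puts it in canonical format, and makes any required escape encodings."

So it appears that Request.QueryString uses a different function to parse the "QUERY_STRING" variable from the ServerVariables collection. That would explain why you see a difference between the two. Now, why the custom parsing function and the Uri object's parsing function use different encoding methods is entirely beyond me. Maybe somebody a bit more versed in the aspnet_isapi DLL could provide some answers to that question.

Anyway, hopefully my post makes sense. On a side note, I'd like to add another reference that also provided some very thorough and interesting reading: this article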

Answer 2

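What you indicated as the "broken" encoded string is actually the correct encoding according to the standards. The one you indicated as the "correct" encoding uses a non-standard extension to the specifications that allows a %uXXXX format (I believe it is supposed to indicate UTF-16 encoding).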
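As a rough illustration of where the two forms come from, the sketch below derives each one by hand: the %xx form from the UTF-8 bytes of each character, and the %uXXXX form from its UTF-16 code unit. The helper names are invented for this example, and the helpers only handle non-ASCII characters in the Basic Multilingual Plane; they do not escape reserved ASCII characters, so they are not a replacement for a real URL encoder.

```csharp
using System;
using System.Linq;
using System.Text;

static class EncodingForms
{
    // Hypothetical helper: write each non-ASCII char as its UTF-8 bytes (%xx%xx...).
    public static string ToUtf8Percent(string s) =>
        string.Concat(s.Select(c => c < 128
            ? c.ToString()
            : string.Concat(Encoding.UTF8.GetBytes(c.ToString())
                                         .Select(b => "%" + b.ToString("x2")))));

    // Hypothetical helper: write each non-ASCII char as its UTF-16 code unit (%uXXXX).
    public static string ToPercentU(string s) =>
        string.Concat(s.Select(c => c < 128
            ? c.ToString()
            : "%u" + ((int)c).ToString("x4")));

    static void Main()
    {
        string term = "he\u0159m\u00e1nek"; // the search term from the question

        Console.WriteLine(ToUtf8Percent(term)); // he%c5%99m%c3%a1nek
        Console.WriteLine(ToPercentU(term));    // he%u0159m%u00e1nek
    }
}
```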

In any case, the "broken" encoded string is fine. You can use the following code to test that:

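// Requires the System and System.Web namespaces.
Uri uri = new Uri("http://www.example.com/test.aspx?search=heřmánek");
Console.WriteLine(uri.Query);                        // the escaped query string
Console.WriteLine(HttpUtility.UrlDecode(uri.Query)); // decoded with the default UTF-8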

That works fine. However... on a hunch, I tried UrlDecode with the Latin-1 code page specified instead of the default UTF-8:

Console.WriteLine(HttpUtility.UrlDecode(uri.Query, 
           Encoding.GetEncoding("iso-8859-1")));

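... and I got the bad value you described, 'heÅmÃ¡nek'. In other words, it looks like the call to HttpContext.RewritePath() somehow changes the URL encoding/decoding to use the Latin-1 code page rather than UTF-8, which is the default encoding used by the UrlEncode/UrlDecode methods.

This looks like a bug, if you ask me. You can look at the RewritePath() code in Reflector and see that it is definitely playing with the query string: it passes it around to all kinds of virtual path functions and out to some unmanaged IIS code.

I wonder if somewhere along the way the Uri at the core of the Request object gets manipulated with the wrong code page? That would explain why Request.QueryString (which is simply the raw values from the HTTP headers) would be correct, while the Uri, using the wrong encoding for the diacritics, would be incorrect.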

Answer 3

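I have done a bit of research over the past day or so, and I think I have some relevant information.

When you use Request.QueryString or HttpUtility.UrlDecode (or UrlEncode), it uses the Encoding specified in the <globalization> element (specifically the requestEncoding attribute) of web.config (or of the .config hierarchy, if you haven't specified one there), NOT Encoding.Default, which is the default encoding for your server.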

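When the encoding is set to UTF-8, a single Unicode character can be encoded as two %xx hex values. It will also be decoded that way when given the whole value.

If you UrlDecode with a different Encoding than the one the URL was encoded with, you will get a different result.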
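A small sketch of that mismatch at the byte level, with no web stack involved: it encodes the search term as UTF-8 and then decodes the same bytes as Latin-1 (ISO-8859-1). The class and method names are placeholders for this example. The character ř becomes two bytes in UTF-8, and each byte turns into its own character when read back as Latin-1, which is where the 'Å' and 'Ã¡' garbage comes from.

```csharp
using System;
using System.Text;

static class MismatchDemo
{
    // Hypothetical helper: encode text with one encoding, decode the bytes with another.
    public static string Reinterpret(string text, Encoding encodedWith, Encoding decodedWith) =>
        decodedWith.GetString(encodedWith.GetBytes(text));

    static void Main()
    {
        string term = "he\u0159m\u00e1nek"; // the search term from the question
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // A single non-ASCII character takes two bytes in UTF-8.
        Console.WriteLine(Encoding.UTF8.GetByteCount("\u0159")); // 2

        // Matching encodings round-trip cleanly.
        Console.WriteLine(Reinterpret(term, Encoding.UTF8, Encoding.UTF8) == term); // True

        // Mismatched encodings yield one character per UTF-8 byte:
        // 0xC5 0x99 -> U+00C5 U+0099, and 0xC3 0xA1 -> U+00C3 U+00A1.
        string garbled = Reinterpret(term, Encoding.UTF8, latin1);
        Console.WriteLine(garbled == "he\u00c5\u0099m\u00c3\u00a1nek"); // True
    }
}
```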

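Since HttpUtility.UrlEncode and UrlDecode can take an encoding parameter, it is tempting to try to encode using an ANSI code page, but UTF-8 is the right way to go if you have the browser support (apparently old versions don't support UTF-8). You just need to make sure that the <globalization> element is set properly, and both sides will work fine.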
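For reference, a typical web.config fragment that sets the request encoding explicitly might look like the following. This is a sketch rather than a copy of anyone's actual configuration; requestEncoding is the attribute discussed above, and responseEncoding is included only because the two are usually set together.

```xml
<configuration>
  <system.web>
    <!-- Decode incoming request data (including the query string) as UTF-8 -->
    <globalization requestEncoding="utf-8" responseEncoding="utf-8" />
  </system.web>
</configuration>
```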

UTF-8 seems to be the default encoding (from .NET Reflector, System.Web.HttpRequest):

internal Encoding QueryStringEncoding
{
    get
    {
        Encoding contentEncoding = this.ContentEncoding;
        if (!contentEncoding.Equals(Encoding.Unicode))
        {
            return contentEncoding;
        }
        return Encoding.UTF8;
    }
}

Following the trail to find out where this.ContentEncoding comes from leads you to this (also in HttpRequest):

public Encoding ContentEncoding
{
    get
    {
        if (!this._flags[0x20] || (this._encoding == null))
        {
            this._encoding = this.GetEncodingFromHeaders();
            if (this._encoding == null)
            {
                GlobalizationSection globalization = RuntimeConfig.GetLKGConfig(this._context).Globalization;
                this._encoding = globalization.RequestEncoding;
            }
            this._flags.Set(0x20);
        }
        return this._encoding;
    }
    set
    {
        this._encoding = value;
        this._flags.Set(0x20);
    }
}

To answer your specific question about the difference between Request.Url.Query and Request.QueryString... here is how HttpRequest builds its Url property:

public Uri Url
{
    get
    {
        if ((this._url == null) && (this._wr != null))
        {
            string queryStringText = this.QueryStringText;
            if (!string.IsNullOrEmpty(queryStringText))
            {
                queryStringText = "?" + HttpEncoder.CollapsePercentUFromStringInternal(queryStringText, this.QueryStringEncoding);
            }
            if (AppSettings.UseHostHeaderForRequestUrl)
            {
                string knownRequestHeader = this._wr.GetKnownRequestHeader(0x1c);
                try
                {
                    if (!string.IsNullOrEmpty(knownRequestHeader))
                    {
                        this._url = new Uri(this._wr.GetProtocol() + "://" + knownRequestHeader + this.Path + queryStringText);
                    }
                }
                catch (UriFormatException)
                {
                }
            }
            if (this._url == null)
            {
                string serverName = this._wr.GetServerName();
                if ((serverName.IndexOf(':') >= 0) && (serverName[0] != '['))
                {
                    serverName = "[" + serverName + "]";
                }
                this._url = new Uri(this._wr.GetProtocol() + "://" + serverName + ":" + this._wr.GetLocalPortAsString() + this.Path + queryStringText);
            }
        }
        return this._url;
    }
}

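You can see that it uses the HttpEncoder class to do the decoding, but with the same QueryStringEncoding value.

Since I am already posting a lot of code here and anyone can get .NET Reflector, I am going to summarize the rest. The QueryString property comes from the HttpValueCollection, which uses the FillFromEncodedBytes method to eventually call HttpUtility.UrlDecode (with the QueryStringEncoding value set above), which in turn calls HttpEncoder to decode it. The two paths do seem to use different methods to decode the actual bytes of the query string, but the encoding they use to do it seems to be the same.

It is interesting to me that HttpEncoder has so many functions that seem to do the same thing, so it is possible that differences between those methods are what cause the issue.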

What is the difference between Request.Url.Query and Request.QueryString?
