How to create a Uri without the dontEscape parameter

Time: 2016-01-29 08:13:25

Tags: c# .net httpwebrequest

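I ran into a problem when downloading a page with HttpWebRequest:

Before creating the new Uri, I escape the URL and pass it to the Uri constructor. But when I download the page with HttpWebRequest, the escaped quote character (%27) is converted back to a literal '. Strange.

Original: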

https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs

Escaped, and passed to the Uri constructor:

https://fr.wikipedia.org/wiki/Roi_Julian_!_L%27%C3%89lu_des_l%C3%A9murs

What HttpWebRequest sends to the server:

https://fr.wikipedia.org/wiki/Roi_Julian_!_L'%C3%89lu_des_l%C3%A9murs

Here is my test code:

private static void Test()
{
    var title = "Roi_Julian_!_L'Élu_des_lémurs";
    var url = "https://fr.wikipedia.org/wiki/" + Uri.EscapeDataString(title);
    var uri = new Uri(url);

    HttpWebDownload(uri);
}

private static void HttpWebDownload(Uri uri)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    request.Method = "GET";
    request.AllowAutoRedirect = false;

    // Dispose the response and the reader even if reading throws.
    using (WebResponse response = request.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
    {
        string pageResponse = reader.ReadToEnd();
        Console.WriteLine(pageResponse);
    }
}

Here is the System.Net trace log:

System.Net Verbose: 0 : [10640] WebRequest::Create(https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs)
System.Net Verbose: 0 : [10640] HttpWebRequest#33111870::HttpWebRequest(https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs#-554901600)
System.Net Information: 0 : [10640] Current OS installation type is 'Server'.
System.Net Information: 0 : [10640] RAS supported: True
System.Net Verbose: 0 : [10640] Exiting HttpWebRequest#33111870::HttpWebRequest() 
System.Net Verbose: 0 : [10640] Exiting WebRequest::Create()    -> HttpWebRequest#33111870
System.Net Verbose: 0 : [10640] HttpWebRequest#33111870::GetResponse()
System.Net Error: 0 : [10640] Can't retrieve proxy settings for Uri 'https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs'. Error code: 12180.
System.Net Verbose: 0 : [10640] ServicePoint#66337667::ServicePoint(fr.wikipedia.org:443)
System.Net Information: 0 : [10640] Associating HttpWebRequest#33111870 with ServicePoint#66337667
System.Net Information: 0 : [10640] Associating Connection#35489797 with HttpWebRequest#33111870
System.Net Information: 0 : [10640] Connection#35489797 - Created connection from 10.168.184.78:55975 to 198.35.26.96:443.
System.Net Information: 0 : [10640] TlsStream#45795543::.ctor(host=fr.wikipedia.org, #certs=0)
System.Net Information: 0 : [10640] Associating HttpWebRequest#33111870 with ConnectStream#65677972
System.Net Information: 0 : [10640] HttpWebRequest#33111870 - Request: GET /wiki/Roi_Julian_!_L'%C3%89lu_des_l%C3%A9murs HTTP/1.1

System.Net Information: 0 : [10640] ConnectStream#65677972 - Sending headers
{
Host: fr.wikipedia.org
Connection: Keep-Alive
}.
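To compare what the Uri instance holds with what shows up in the log and on the wire, it may help to print its different string views side by side. The sketch below is only an illustration: the class name UriViews and the method Describe are hypothetical, and the exact values of AbsoluteUri and PathAndQuery are assumed to vary between .NET versions, so no output is shown for them. OriginalString always returns the string that was passed to the constructor, and ToString() returns a canonical form with most escapes decoded.

```csharp
using System;

public static class UriViews
{
    // Hypothetical helper: returns several string views of the same Uri
    // so they can be compared with the trace log.
    public static string[] Describe(string url)
    {
        var uri = new Uri(url);
        return new[]
        {
            uri.OriginalString, // exactly the string passed to the constructor
            uri.AbsoluteUri,    // canonical escaped form; may differ by runtime version
            uri.PathAndQuery,   // path and query portion of the canonical form
            uri.ToString()      // canonical form with most escapes decoded
        };
    }

    public static void Main()
    {
        var url = "https://fr.wikipedia.org/wiki/Roi_Julian_!_L%27%C3%89lu_des_l%C3%A9murs";
        foreach (var view in Describe(url))
        {
            Console.WriteLine(view);
        }
    }
}
```

If the second and third lines already show a literal ' instead of %27, the conversion happens inside Uri itself rather than in HttpWebRequest.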

I thought this was caused by the dontEscape parameter, so I added a new function to work around it, but it did not work.

private const ulong UserEscape = 0x00080000;           
public static void EnableUserEscape(Uri uri)
{
    FieldInfo fieldInfo = uri.GetType().GetField("m_Flags", BindingFlags.Instance | BindingFlags.NonPublic);
    if (fieldInfo == null)
    {
        throw new MissingFieldException("'m_Flags' field not found");
    }
    var uriFlags = (ulong)fieldInfo.GetValue(uri);
    uriFlags = uriFlags | UserEscape;
    fieldInfo.SetValue(uri, uriFlags);
}

Before passing the Uri to HttpWebDownload(), I call this function to enable UserEscape, but in the end HttpWebRequest sends a URL like this to the server: https://fr.wikipedia.org/wiki/Roi_Julian_!_L'?lu_des_l?murs

Can anyone offer a solution?

Thanks

1 answer:

Answer 0: (score: 0)

  

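Define the url in your test method as shown below. Does that help?

A string written as @"somestring + SpecialCharacter" is called a verbatim string. It basically means that no interpretation is applied to special characters in the string until the next quote character is reached.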

var url = @"https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
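
As a small sketch of what the verbatim prefix actually changes: it affects how the C# compiler reads backslash escape sequences in the literal, so it only makes a difference when the literal contains a backslash. The class and method names below (VerbatimDemo, SameUrl, BackslashDiffers) are hypothetical and exist only for this illustration.

```csharp
using System;

public static class VerbatimDemo
{
    // The URL contains no backslashes, so the verbatim and regular
    // literals produce the same string.
    public static bool SameUrl()
    {
        var verbatim = @"https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
        var regular = "https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
        return verbatim == regular;
    }

    // With a backslash the two forms differ: @"a\nb" keeps the backslash
    // and the letter n (4 characters), "a\nb" contains a newline (3 characters).
    public static bool BackslashDiffers()
    {
        return @"a\nb".Length == 4 && "a\nb".Length == 3;
    }

    public static void Main()
    {
        Console.WriteLine(SameUrl());          // prints True
        Console.WriteLine(BackslashDiffers()); // prints True
    }
}
```

For the URL in the question both literals are therefore identical, and any escaping or unescaping is applied later by Uri.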