Question

I scraped some data from a website. One string in the data, named urlresult, is "http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1".

What I want to do is get rid of the first three '\' characters in the urlresult string above. I tried the following function:

public string ConvertDataToUrl(string urlresult)
{
    // Strip backslashes from the part before '?', then re-attach the query string unchanged.
    var url = urlresult.Split('?')[0].Replace(@"\", "") + "?" + urlresult.Split('?')[1];

    return url;
}

It returns "http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\\u5317\\u4eac\\u6c83\\u534e\\u521b\\u65b0\\u79d1\\u6280\\u6709\\u9650\\u516c\\u53f8&softwareType=1", which is not correct.

The correct result is "http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=北京沃华创新科技有限公司&softwareType=1"

I have tried many approaches, but none of them worked. I don't know how to get the correct result.

Answer 1

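I think you may be being misled by the debugger, because there is no reason for the code you provided to insert extra "\" characters. Debuggers typically show an extra "\" inside quoted strings so that you can tell which "\" characters are really in the string and which are part of an escape for some other special character. I suggest writing the string out with Debug.WriteLine or putting it in a log file. I don't think the information you gave in the question is correct.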

As proof, I compiled and ran this code:

static void Main(string[] args)
{
   var url = @"http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1";
   Console.WriteLine("{0}{1}{2}", url, Environment.NewLine, 
      url.Split('?')[0].Replace(@"\", "") + "?" + url.Split('?')[1]);
}

The output is:

http:\/\/www.cnopyright.com.cn\/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1
http://www.cnopyright.com.cn/index.php?com=com_noticeQuery&method=wareList&optionid=1221&obligee=\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8&softwareType=1
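
To check what a string really contains, independently of how the debugger renders it, one option is to count the backslash characters directly. This is only a minimal sketch; the BackslashCheck class and CountBackslashes method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;

public static class BackslashCheck
{
    // Counts the backslash characters actually stored in the string,
    // regardless of how a debugger chooses to display them.
    public static int CountBackslashes(string s)
    {
        return s.Count(c => c == '\\');
    }
}
```

For a fragment such as @"obligee=\u5317\u4eac" this gives 2. If the backslashes had really been doubled, it would give 4.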

Answer 2

You can use the System.Text.RegularExpressions.Regex.Unescape method:

var input = @"\u5317\u4eac\u6c83\u534e\u521b\u65b0\u79d1\u6280\u6709\u9650\u516c\u53f8";
// Unescape decodes each \uXXXX escape into the character it represents.
string unescapedText = System.Text.RegularExpressions.Regex.Unescape(input);
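
For the whole URL from the question, both kinds of escape (the \/ sequences and the \uXXXX sequences) need decoding. The sketch below is one possible way to do that, not a definitive implementation. ConvertDataToUrl is the method name from the question; the UrlDecoder class name is an assumption, and an explicit \uXXXX pattern is used here in place of Regex.Unescape so that only the two escape forms seen in the data are touched.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UrlDecoder
{
    public static string ConvertDataToUrl(string urlresult)
    {
        // Turn each escaped slash "\/" back into a plain "/".
        string withoutEscapedSlashes = urlresult.Replace(@"\/", "/");

        // Replace every \uXXXX escape (four hex digits) with the UTF-16 code unit it encodes.
        return Regex.Replace(
            withoutEscapedSlashes,
            @"\\u([0-9a-fA-F]{4})",
            m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
    }
}
```

Calling UrlDecoder.ConvertDataToUrl on the urlresult value from the question should produce the URL with obligee=北京沃华创新科技有限公司. If the parameter value is going to be sent in a request, it may still need URL encoding afterwards.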
