XSLT-based website goes to 100% CPU

Date: 2014-12-25 05:57:41

Tags: html asp.net regex xml xslt

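I have a website written with XSLT. The idea is that the data is stored in XML files and the site uses XSL templates to transform that XML into HTML. The developer who chose this technology has left our company, and nobody knows why or how it was done.

The problem is that one day the site started using 100% CPU and the server hung. A dump file shows one of the threads doing this: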

Thread  26
Current frame: (MethodDesc 0x7a4b7f68 +0x4f System.Text.RegularExpressions.RegexInterpreter.Go())
ChildEBP RetAddr  Caller,Callee
11f3ecb4 7a5c20b1 (MethodDesc 0x7a45eab4 +0x91 System.Text.RegularExpressions.RegexRunner.Scan(System.Text.RegularExpressions.Regex, System.String, Int32, Int32, Int32, Int32, Boolean))
11f3eccc 7a5c1e97 (MethodDesc 0x7a45e99c +0x87 System.Text.RegularExpressions.Regex.Run(Boolean, Int32, System.String, Int32, Int32, Int32)), calling (MethodDesc 0x7a45eab4 +0 System.Text.RegularExpressions.RegexRunner.Scan(System.Text.RegularExpressions.Regex, System.String, Int32, Int32, Int32, Int32, Boolean))
11f3ed04 7a5c1dfd (MethodDesc 0x7a45e858 +0x2d System.Text.RegularExpressions.Regex.Match(System.String)), calling (MethodDesc 0x7a45e99c +0 System.Text.RegularExpressions.Regex.Run(Boolean, Int32, System.String, Int32, Int32, Int32))
11f3ed24 7a5c510c (MethodDesc 0x7a45e840 +0x2c System.Text.RegularExpressions.Regex.Match(System.String, System.String)), calling (MethodDesc 0x7a45e858 +0 System.Text.RegularExpressions.Regex.Match(System.String))
11f3ed38 667c1868 (MethodDesc 0x65fac47c +0x70 System.Web.UI.WebControls.RegularExpressionValidator.EvaluateIsValid()), calling (MethodDesc 0x7a45e840 +0 System.Text.RegularExpressions.Regex.Match(System.String, System.String))
11f3ed60 667acd0d (MethodDesc 0x65f60d84 +0x49 System.Web.UI.WebControls.BaseValidator.Validate())
11f3ed70 6669798e (MethodDesc 0x65f5b434 +0x8e System.Web.UI.Page.Validate()), calling 0289948e
11f3ed88 66903dbf (MethodDesc 0x65f5aeb4 System.Web.UI.Page.RaisePostBackEvent(System.Collections.Specialized.NameValueCollection))
11f3ed9c 660abb2e (MethodDesc 0x65f5b2c8 +0x61e System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)), calling (MethodDesc 0x65f5aeb4 +0 System.Web.UI.Page.RaisePostBackEvent(System.Collections.Specialized.NameValueCollection))
11f3eef0 660ab3b4 (MethodDesc 0x65f5b26c +0x84 System.Web.UI.Page.ProcessRequest(Boolean, Boolean)), calling (MethodDesc 0x65f5b2c8 +0 System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean))
11f3ef14 0f32271d (MethodDesc 0xf136b58 +0x3d System.Threading.Thread.get_CurrentUICulture()), calling mscorwks!CorExitProcess+0x1bb1c
11f3ef28 660ab2e1 (MethodDesc 0x65f5b260 +0x51 System.Web.UI.Page.ProcessRequest()), calling (MethodDesc 0x65f5b26c +0 System.Web.UI.Page.ProcessRequest(Boolean, Boolean))
11f3ef64 660ab276 (MethodDesc 0x65f5b23c +0x16 System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)), calling (MethodDesc 0x65f5b260 +0 System.Web.UI.Page.ProcessRequest())
11f3ef70 660ab252 (MethodDesc 0x65f5b228 +0x32 System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)), calling (MethodDesc 0x65f5b23c +0 System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext))
11f3ef84 0fc19105 (MethodDesc 0x120a1888 +0x5 ASP.website_default_aspx.ProcessRequest(System.Web.HttpContext)), calling (MethodDesc 0x65f5b228 +0 System.Web.UI.Page.ProcessRequest(System.Web.HttpContext))
11f3ef88 030d4904 (MethodDesc 0x11f82ce0 +0x34 Xplode.Web.ApplicationRuntime.XplodePageHandler.ProcessRequest(System.Web.HttpContext)), calling 0289efda
11f3ef98 660b1726 (MethodDesc 0x65f6c088 +0xb6 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()), calling 0289efda
11f3ef9c 0fc19016 (MethodDesc 0x2cd9658 +0x56 FiftyOne.Foundation.Mobile.Detection.DetectorModule.SetPagePreIntClientTargets(System.Object, System.EventArgs)), calling mscorwks+0x9362
11f3efcc 6608445c (MethodDesc 0x65f67c8c +0x4c System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)), calling 0289d86a
11f3f008 6608fcd3 (MethodDesc 0x65fcb1ac +0x133 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)), calling (MethodDesc 0x65f67c8c +0 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef))
11f3f05c 660839dc (MethodDesc 0x65f67bac +0x7c System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object))
11f3f070 66086f4c (MethodDesc 0x65f6373c +0x17c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)), calling 0289d7ca
11f3f09c 026e2c2c 026e2c2c, calling 0285a248
11f3f0ac 66086bf3 (MethodDesc 0x65fbb8a0 +0x63 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)), calling (MethodDesc 0x65f6373c +0 System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest))
11f3f0bc 66085d8c (MethodDesc 0x65f643bc +0x11c System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)), calling (MethodDesc 0x65fbb8a0 +0 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest))
11f3f0d4 66085d01 (MethodDesc 0x65f643bc +0x91 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)), calling webengine!GetEcb
11f3f11c 79f23fcb mscorwks!InstallCustomModule+0x15733, calling mscorwks+0xcd0d
...

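As you can see, the page is /website/default.aspx. This is the page that accepts a url parameter and redirects to that URL, so I need to find out which page was actually behind this request.

However, only a few forms on this site use a regex, and it is this email pattern: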

\w+(([-+.'’])*(\w+))*@\w+(([-.])*(\w+))*\.\w+([-.]\w+)*

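One of these forms is the "sign up for newsletter" form in the footer, but removing that form changed nothing. I then added logging to see what was being sent to the validator, and I saw a lot of spam from bots.

Here is some code that shows how the pages are rendered: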

<dandaraui:HomesForSale SaveParametersInSession="false"
 StyleSheet="~/assets/xslt/developments/development-homes-for-sale.xslt"   
 runat="server" ID="homesForSaleControl">
</dandaraui:HomesForSale>

HomesForSale.Render:

protected override void Render(HtmlTextWriter writer)
{
  if (PersistedDevelopment.Current == null)
    return;
  XPathDocument xpathDocument = PersistedDevelopment.Current.XPathDocument;
  XslCachedTransform xslCachedTransform = new XslCachedTransform(this.m_sStylesheet);
  HttpRequest request = HttpContext.Current.Request;
  for (int index = 0; index < request.QueryString.Keys.Count; ++index)
    this.m_pXsltArgumentList.AddParam(request.QueryString.Keys[index], string.Empty, (object) request.QueryString[index]);
  string str = xslCachedTransform.Transform(xpathDocument, this.m_pXsltArgumentList).Replace("&amp;#163;", "£");
  writer.Write(str);
}

1 answer:

Answer 0 (score: 4)

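Your regex is inefficient and badly written, and it causes what is known as catastrophic backtracking: there are too many different ways in which the pattern can try to match your string.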
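To make the effect concrete, here is a minimal sketch that runs the pattern from the question against inputs of growing length and reports how long each attempt takes. The trouble is in `\w+(([-+.'’])*(\w+))*`: because the separator group may match nothing, a plain run of word characters can be divided between the leading `\w+` and the repeated `(\w+)` in a very large number of ways, and the engine may try all of them when the rest of the pattern fails. The class and method names are made up for this illustration, and the time limit assumes .NET 4.5 or later, where the `Regex` constructor accepts a match timeout. The stack trace in the question shows mscorwks, so the site itself appears to run on an older runtime where that overload is not available.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
    // The email pattern from the question, unchanged.
    public const string OriginalPattern =
        @"\w+(([-+.'’])*(\w+))*@\w+(([-.])*(\w+))*\.\w+([-.]\w+)*";

    // Runs one match under a time limit (hypothetical helper).
    // Returns true on a match; sets timedOut if the engine gave up.
    public static bool TryMatch(string pattern, string input, TimeSpan limit, out bool timedOut)
    {
        timedOut = false;
        try
        {
            return new Regex(pattern, RegexOptions.None, limit).IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            timedOut = true;
            return false;
        }
    }

    public static void Main()
    {
        TimeSpan limit = TimeSpan.FromSeconds(2);
        foreach (int length in new[] { 10, 20, 30, 40 })
        {
            // Word characters followed by '@' and nothing else: never a valid address.
            string input = new string('a', length) + "@";
            bool timedOut;
            Stopwatch watch = Stopwatch.StartNew();
            bool matched = TryMatch(OriginalPattern, input, limit, out timedOut);
            watch.Stop();
            Console.WriteLine("{0,2} chars: matched={1} timedOut={2} elapsed={3} ms",
                length, matched, timedOut, watch.ElapsedMilliseconds);
        }
    }
}
```

The exact timings depend on the runtime version and the machine, so no output is shown here. The point to look for is how quickly the elapsed time grows as the input gets longer.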

For more details, see the following questions:

As for your pattern, I suggest reading these first:

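Now that you can make an informed decision, you can pick a pattern from here: Regex library - email. They are all poor, but none of them is as bad as yours. I would suggest using [\w-]+@([\w-]+\.)+[\w-]+ or .*@.*\..*

If you insist on keeping something similar to your pattern, you can remove the catastrophic backtracking by applying the unrolling-the-loop technique: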

\w+(?:[-+.'’]\w*)*@\w+(?:[-.]\w*)*\.\w+

Again, this pattern does not make much sense, but it should be equivalent to yours, only more efficient.
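As a rough sketch of how the unrolled pattern could be used in code, the helper below applies it with the same whole-value rule that, as far as I know, RegularExpressionValidator uses (the match must start at index 0 and cover the entire input). It also adds a match timeout as a safety net. `EmailCheck` and `IsValidEmail` are hypothetical names, rejecting empty input is a choice made here rather than the validator's behavior, and the timeout again assumes .NET 4.5 or later.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailCheck
{
    // The unrolled pattern from the answer, unchanged.
    public const string UnrolledPattern = @"\w+(?:[-+.'’]\w*)*@\w+(?:[-.]\w*)*\.\w+";

    // One shared instance; the one-second limit is an arbitrary safety net.
    private static readonly Regex Unrolled =
        new Regex(UnrolledPattern, RegexOptions.None, TimeSpan.FromSeconds(1));

    // Returns true only when the whole value matches the pattern.
    // A timeout is treated as "invalid" instead of being allowed to hang the request.
    public static bool IsValidEmail(string value)
    {
        if (string.IsNullOrEmpty(value))
            return false;
        try
        {
            Match match = Unrolled.Match(value);
            return match.Success && match.Index == 0 && match.Length == value.Length;
        }
        catch (RegexMatchTimeoutException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("john.doe@example.com"));
        Console.WriteLine(IsValidEmail(new string('a', 40) + "@"));
    }
}
```

Treating a timeout as a failed validation means that bot input which still manages to stress the engine costs at most the time limit per request rather than pinning a thread indefinitely.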