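I am using a ColdFusion gateway to fire and forget a large number of operations. To do this, I loop over a query and call SendGatewayMessage() at the end of each iteration.
However, the query I loop over can get very large (100,000+ records).
To prevent operations from being lost, I increased the queue size and the number of threads.
Because operations were still being lost, I added a loop before SendGatewayMessage(), like so: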
<cfloop condition="gatewayService.getQueueSize() GTE gatewayService.getMaxQueueSize()">
<cfset guardianCount = guardianCount+1>
</cfloop>
<cflog file="gatewayGuardian" text="#i# waited for #guardianCount# iterations. Queuesize:#gatewayService.getQueueSize()#">
<cfset SendGatewayMessage("EventGateway",eventData)>
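(More information about the gatewayService class here)
This is more or less acceptable, because I can increase the request timeout to a couple of hours (!), but I am still looking for a more effective way to slow down the sending of messages to the queue, in the hope that the overall process will be faster and put less strain on server resources.
Any suggestions? Any thoughts on the consequences of increasing the queue size even further?
Answer 0 (score: 1)
Right now, I use application variables to keep track of the total number of records in the job, the number of batches processed, and the number of records processed. At the start of the job, a piece of code initializes all of these variables, like so: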
<cfif not structKeyExists(application,"batchNumber") or application.batchNumber
eq 0 or application.batchNumber eq "">
<cfset application.batchNumber = 0>
<cfset application.recordsToDo = 0>
<cfset application.recordsDone = 0>
<cfset application.recordsDoneErrors = 0>
</cfif>
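After that, I load all the records into a query and determine which records in that query need to be processed in the current batch. The number of records in a batch is derived from the total number of records and the maximum queue size. This way a batch is always smaller than the queue (and no more than about half of it when the whole job would fit in the queue), so the job does not crowd out other operations or jobs, and the initial request does not time out.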
<cfset application.recordsToDo = qryRecords.recordcount>
<cfif not structKeyExists(application,"recordsPerBatch") or application.recordsPerBatch eq "" or application.recordsPerBatch eq 0>
<cfset application.recordsPerBatch = ceiling(application.recordsToDo/(ceiling(application.recordsToDo/gatewayService.getMaxQueueSize())+1))>
</cfif>
<cfset startRow = (application.recordsPerBatch*application.batchNumber)+1>
<cfset endRow = startRow + application.recordsPerBatch-1>
<cfif endRow gt application.recordsToDo>
<cfset endRow = application.recordsToDo>
</cfif>
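To make the batch-size formula concrete, here is a small worked example. The record count and queue size are hypothetical numbers chosen for illustration; they do not come from the original job.

```cfml
<cfscript>
// Hypothetical numbers for illustration only.
recordsToDo  = 100000;
maxQueueSize = 25000; // assumed return value of gatewayService.getMaxQueueSize()

// Same formula as above: use one batch more than the number of
// full queues the job would need, then spread the records evenly.
batchCount      = ceiling(recordsToDo / maxQueueSize) + 1; // 4 + 1 = 5
recordsPerBatch = ceiling(recordsToDo / batchCount);        // 100000 / 5 = 20000

writeOutput("batches: #batchCount#, records per batch: #recordsPerBatch#");
</cfscript>
```

With these assumed values each batch holds 20,000 records, which is 80% of the queue. The share only drops to half or less when the whole job is no larger than the queue itself.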
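Then I loop through the query with a from/to loop to fire off the gateway events. I kept the guardian loop, so that a record is never lost because the queue is full.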
<cfloop from="#startRow#" to="#endRow#" index="i">
<cfset guardianCount = 0>
<!--- load all values from the record into a struct --->
<cfset stRecordData = structNew()>
<cfloop list="#qryRecords.columnlist#" index="columnlabel">
<cfset stRecordData[columnlabel] = trim(qryRecords[columnlabel][i])>
</cfloop>
<cfset eventData = structNew()>
<cfset eventData.stData = stRecordData>
<cfset eventData.action = "bigJob">
<cfloop condition="gatewayService.getQueueSize() GTE gatewayService.getMaxQueueSize()">
<cfset guardianCount = guardianCount + 1>
</cfloop>
<cfset SendGatewayMessage("eventGateway",eventData)>
</cfloop>
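The guardian loop as written spins continuously while the queue is full, which keeps a CPU busy for the whole wait. One possible variation, which is not part of the approach described here, is to pause briefly between checks. The sketch below assumes ColdFusion 8 or later (for the sleep() function) and uses an arbitrary 100 ms pause.

```cfml
<!--- Hypothetical variant of the guardian loop: pause between checks instead of spinning. --->
<cfset guardianCount = 0>
<cfloop condition="gatewayService.getQueueSize() GTE gatewayService.getMaxQueueSize()">
    <!--- 100 ms is an assumed value; tune it to how quickly the gateway drains its queue. --->
    <cfset sleep(100)>
    <cfset guardianCount = guardianCount + 1>
</cfloop>
<cfset SendGatewayMessage("eventGateway", eventData)>
```

Here guardianCount counts the number of pauses rather than raw loop iterations, so multiplying it by the pause length gives a rough estimate of how long the request waited for queue space.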
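Whenever a record is finished, a function compares the number of records done with the number of records to do. When they are the same, the job is done. Otherwise we may need to start a new batch. Note that the check to see whether we are done is inside a cflock, but the actual event announcement is not. This is because you can run into a deadlock when the event you announce cannot read the variables that are used inside the lock.
I hope this is useful to someone, or that someone else has a better idea.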
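<!--- Default to false so the check after the lock never reads an undefined variable --->
<cfset doEventAnnounce = false>
<cflock timeout="30" name="jobResult">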
<cfset application.recordsDone++>
<cfif application.recordsDone eq application.recordsToDo>
<!--- We are done. Set all the application variables we used back to zero, so they do not get in the way when we start the job again --->
<cfset application.batchNumber = 0>
<cfset application.recordsToDo = 0>
<cfset application.recordsDone = 0>
<cfset application.recordsPerBatch = 0>
<cfset application.recordsDoneErrors = 0>
<cfset application.JobStarted = 0>
<!--- If the number of records we have done is the same as the number of records in a batch times the current batchnumber plus one, we are done with the batch. --->
<cfelseif application.recordsDone eq application.recordsPerBatch*(application.batchNumber+1)
and application.recordsDone neq application.recordsToDo>
<cfset application.batchNumber++>
<cfset doEventAnnounce = true>
</cfif>
</cflock>
<cfif doEventAnnounce>
<!--- Fire off the event that starts the job. All the info it needs is in the applicationscope. --->
<cfhttp url="#URURLHERE#/index.cfm" method="post">
<cfhttpparam type="url" name="event" value="startBigJob">
</cfhttp>
</cfif>