Azure continuous WebJob fails in some cases

Time: 2017-11-15 15:01:32

Tags: c# azure azure-webjobs continuous webjob

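I have a continuous WebJob running on Azure. Since a large deployment 8 hours ago, some invocations never reach a finished status, while others complete their work normally. I have enabled every kind of logging I could find and have spent a long time trying to work out what is wrong.

The only error information I can find in the logs comes from job_log, which states: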

  

    [11/15/2017 14:46:23 > e553e5: ERR ] Unhandled Exception: Microsoft.WindowsAzure.Storage.StorageException: The remote server returned an error: (404) Not Found. ---> System.Net.WebException: The remote server returned an error: (404) Not Found.
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.WindowsAzure.Storage.Shared.Protocol.HttpResponseParsers.ProcessExpectedStatusCodeNoException[T](HttpStatusCode expectedStatusCode, HttpStatusCode actualStatusCode, T retVal, StorageCommandBase`1 cmd, Exception ex) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Shared\Protocol\HttpResponseParsers.Common.cs:line 50
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.WindowsAzure.Storage.Blob.CloudBlob.<DeleteBlobImpl>b__33(RESTCommand`1 cmd, HttpWebResponse resp, Exception ex, OperationContext ctx) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlob.cs:line 3349
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndGetResponse[T](IAsyncResult getResponseResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 299
    [11/15/2017 14:46:23 > e553e5: ERR ]    --- End of inner exception stack trace ---
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync[T](IAsyncResult result) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 50
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.WindowsAzure.Storage.Blob.CloudBlob.EndDelete(IAsyncResult asyncResult) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Blob\CloudBlob.cs:line 1729
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.WindowsAzure.Storage.Core.Util.AsyncExtensions.<>c__DisplayClass4.b__3(IAsyncResult ar) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Util\AsyncExtensions.cs:line 114
    [11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.Azure.WebJobs.Host.Protocols.PersistentQueueWriter`1.<DeleteAsync>d__6.MoveNext()
    [11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.Azure.WebJobs.Host.Loggers.CompositeFunctionInstanceLogger.<DeleteLogFunctionStartedAsync>d__e.MoveNext()
    [11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.<TryExecuteAsync>d__1.MoveNext()
    [11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.Azure.WebJobs.Host.Executors.TriggeredFunctionExecutor`1.d__0.MoveNext()
    [11/15/2017 14:46:23 > e553e5: ERR ] --- End of stack trace from previous location where exception was thrown ---
    [11/15/2017 14:46:23 > e553e5: ERR ]    at Microsoft.Azure.WebJobs.Host.Timers.BackgroundExceptionDispatcher.<>c__DisplayClass1.b__0()
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
    [11/15/2017 14:46:23 > e553e5: ERR ]    at System.Threading.ThreadHelper.ThreadStart()

Can anyone give me some ideas on how to debug this? I have run out of ideas.

The Main method of my WebJob looks like this:

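    static void Main()
    {
        // Build the configuration first so it can be handed to the host.
        var config = new JobHostConfiguration();
        // Wait at most 30 seconds between polls of an empty queue.
        config.Queues.MaxPollingInterval = new TimeSpan(0, 0, 0, 30);
        // Move a message to the poison queue after 3 failed attempts.
        config.Queues.MaxDequeueCount = 3;

        // Pass the configuration to the host; otherwise the queue settings above are never applied.
        var host = new JobHost(config);

        // The following code ensures that the WebJob will be running continuously
        host.RunAndBlock();
    }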

And ProcessQueueMessage looks like this:

    public static void ProcessQueueMessage([QueueTrigger("importqueue")] string msg)
    {
        try
        {
            // Hand the queue message to the worker together with the storage credentials.
            WorkerWebJobCore wwjc = new WorkerWebJobCore();
            wwjc.RunCore(msg, TableStorageAccessResources.ImportQueue,
                TableStorageAccessResources.TableStorageDataOneId,
                TableStorageAccessResources.TableStorageDataOnePassword);
        }
        catch (Exception e)
        {
            // Catch every exception raised by the worker; the details in e are not included in the log entry.
            CommunicatorLog.Log.LogError("WebJobWorker", "WebJobWorker", "Error in processing queue message", "ERRWJWF01");
        }
    }

So I am catching everything, and I don't understand how it can still fail.

Thanks in advance.

2 answers:

Answer 0: (score: 0)

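Apparently, running a version of Microsoft.Azure.WebJobs below 2.0.0 makes it impossible to get a useful answer. When I finally tried installing that version, it gave me a useful error message that pointed to the problem.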

The problem was a wrong version of dj, which affected how the WebJob core works.

Answer 1: (score: 0)

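My guess is that something is messing with either your queue or the files in the storage itself.

It looks like it is trying to delete a file that no longer exists. Or perhaps something "bigger" is being deleted.

Digging further, it could also be a problem with how you deploy your WebJob. Maybe the deployments sometimes differ? Have a look at these: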
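This would also explain why the try/catch in ProcessQueueMessage does not help: judging by the stack trace in the question, the failing delete happens in host code (FunctionExecutor and CompositeFunctionInstanceLogger) rather than in the function body. Below is a minimal, self-contained sketch of that situation. It does not use the WebJobs SDK, and the names Execute and DeleteFunctionStartedLog are hypothetical stand-ins for the host's bookkeeping, not real SDK methods.

```csharp
using System;

static class HostCleanupSketch
{
    // Stand-in for the queue function: it catches every exception raised in its own body.
    static void ProcessQueueMessage(string msg)
    {
        try
        {
            throw new InvalidOperationException("failure inside the function body");
        }
        catch (Exception e)
        {
            Console.WriteLine("function caught: " + e.Message);
        }
    }

    // Hypothetical stand-in for the host's own bookkeeping, e.g. deleting a log blob
    // that is already gone. This is an assumption for illustration only.
    static void DeleteFunctionStartedLog()
    {
        throw new Exception("(404) Not Found");
    }

    // Hypothetical stand-in for the host invoking the function and then cleaning up.
    static void Execute(string msg)
    {
        ProcessQueueMessage(msg);
        // Runs after the function has returned, so it is outside the function's try/catch.
        DeleteFunctionStartedLog();
    }

    static void Main()
    {
        try
        {
            Execute("hello");
        }
        catch (Exception e)
        {
            // In the real host nothing catches this, which is why the process reports an unhandled exception.
            Console.WriteLine("host-level failure: " + e.Message);
        }
    }
}
```

The function's catch block handles its own exception, but the second exception is raised after the function has already returned, so no handler in user code can see it.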

Azure Web Job-The remote server returned 404

https://github.com/Azure/azure-webjobs-sdk/issues/922

Azure WebJob QueueTrigger message is not deleted from queue

https://github.com/Azure/azure-webjobs-sdk/issues/645