App Engine Map减少控制器回调的异常

时间:2012-12-02 12:15:08

标签: java google-app-engine mapreduce

在我们的应用引擎Java实例上运行Map Reduce作业时,它会在一段时间后停止,并且我得到一个异常,即请求太大。从异常中不清楚如何解决这个问题,因为在洗牌阶段发生这种情况应该是无缝的。

我理解实体大小,超时等方面的限制,但我不确定在API中指明这一点?

我使用的代码与代码here非常相似,它只计算不同的属性属性,并使servlet几乎相同。

这就是我得到的:

2012-12-02 13:18:53.632 /mapreduce/controllerCallback 500 535ms 0kb AppEngine-Google; (+http://code.google.com/appengine)
0.1.0.2 - - [02/Dec/2012:03:18:53 -0800] "POST /mapreduce/controllerCallback HTTP/1.1" 500 0 "http://xxx/mapreduce/controllerCallback" "AppEngine-Google; (+http://code.google.com/appengine)" "xxx.com" ms=535 cpu_ms=288 queue_name=default task_name=16764257756372630651 instance=00c61b117cc2653091fefc2f9b795b790f1a
I 2012-12-02 13:18:53.203
com.google.appengine.tools.mapreduce.impl.shardedjob.ShardedJobRunner pollTaskStates: Polling task states for job 002b0ae1-643b-420d-b3cd-20f03cc054ff-reduce, sequence number 166
W 2012-12-02 13:18:53.630
/mapreduce/controllerCallback
com.google.apphosting.api.ApiProxy$RequestTooLargeException: The request to API call datastore_v3.Put() was too large.
    at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:494)
    at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:392)
    at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher$1.runInContext(RpcStub.java:781)
    at com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:461)
    at com.google.tracing.TraceContext.runInContext(TraceContext.java:703)
    at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContextNoUnref(TraceContext.java:338)
    at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContext(TraceContext.java:330)
    at com.google.tracing.TraceContext$TraceContextRunnable.run(TraceContext.java:458)
    at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher.rpcFinished(RpcStub.java:823)
    at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher.success(RpcStub.java:808)
    at com.google.net.rpc3.impl.client.RpcClientInternalContext.runCallbacks(RpcClientInternalContext.java:898)
    at com.google.net.rpc3.impl.client.RpcClientInternalContext.finishRpcAndNotifyApp(RpcClientInternalContext.java:803)
    at com.google.net.rpc3.impl.client.RpcNetChannel.afterFinishingActiveRpc(RpcNetChannel.java:1063)
    at com.google.net.rpc3.impl.client.RpcNetChannel.finishRpc(RpcNetChannel.java:911)
    at com.google.net.rpc3.impl.client.RpcNetChannel.handleResponse(RpcNetChannel.java:2267)
    at com.google.net.rpc3.impl.client.RpcNetChannel.messageReceived(RpcNetChannel.java:2068)
    at com.google.net.rpc3.impl.client.RpcNetChannel.access$2000(RpcNetChannel.java:143)
    at com.google.net.rpc3.impl.client.RpcNetChannel$TransportCallback.receivedMessage(RpcNetChannel.java:3129)
    at com.google.net.rpc3.impl.client.RpcChannelTransportData$TransportCallback.receivedMessage(RpcChannelTransportData.java:599)
    at com.google.net.rpc3.impl.wire.RpcBaseTransport.receivedMessage(RpcBaseTransport.java:417)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport$ClientAdapter.receivedMessage(UdrpcTransport.java:424)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport.dispatchPacket(UdrpcTransport.java:265)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport.readPackets(UdrpcTransport.java:217)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport$1.run(UdrpcTransport.java:81)
    at com.google.net.eventmanager.AbstractFutureTask$Sync.innerRun(AbstractFutureTask.java:260)
    at com.google.net.eventmanager.AbstractFutureTask.run(AbstractFutureTask.java:121)
    at com.google.net.eventmanager.EventManagerImpl.runTask(EventManagerImpl.java:578)
    at com.google.net.eventmanager.EventManagerImpl.internalRunWorkerLoop(EventManagerImpl.java:1002)
    at com.google.net.eventmanager.EventManagerImpl.runWorkerLoop(EventManagerImpl.java:884)
    at com.google.net.eventmanager.WorkerThreadInfo.runWorkerLoop(WorkerThreadInfo.java:136)
    at com.google.net.eventmanager.EventManagerImpl$WorkerThread.run(EventManagerImpl.java:1855)
C 2012-12-02 13:18:53.631
Uncaught exception from servlet
com.google.apphosting.api.ApiProxy$RequestTooLargeException: The request to API call datastore_v3.Put() was too large.
    at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:494)
    at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:392)
    at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher$1.runInContext(RpcStub.java:781)
    at com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:461)
    at com.google.tracing.TraceContext.runInContext(TraceContext.java:703)
    at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContextNoUnref(TraceContext.java:338)
    at com.google.tracing.TraceContext$AbstractTraceContextCallback.runInInheritedContext(TraceContext.java:330)
    at com.google.tracing.TraceContext$TraceContextRunnable.run(TraceContext.java:458)
    at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher.rpcFinished(RpcStub.java:823)
    at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher.success(RpcStub.java:808)
    at com.google.net.rpc3.impl.client.RpcClientInternalContext.runCallbacks(RpcClientInternalContext.java:898)
    at com.google.net.rpc3.impl.client.RpcClientInternalContext.finishRpcAndNotifyApp(RpcClientInternalContext.java:803)
    at com.google.net.rpc3.impl.client.RpcNetChannel.afterFinishingActiveRpc(RpcNetChannel.java:1063)
    at com.google.net.rpc3.impl.client.RpcNetChannel.finishRpc(RpcNetChannel.java:911)
    at com.google.net.rpc3.impl.client.RpcNetChannel.handleResponse(RpcNetChannel.java:2267)
    at com.google.net.rpc3.impl.client.RpcNetChannel.messageReceived(RpcNetChannel.java:2068)
    at com.google.net.rpc3.impl.client.RpcNetChannel.access$2000(RpcNetChannel.java:143)
    at com.google.net.rpc3.impl.client.RpcNetChannel$TransportCallback.receivedMessage(RpcNetChannel.java:3129)
    at com.google.net.rpc3.impl.client.RpcChannelTransportData$TransportCallback.receivedMessage(RpcChannelTransportData.java:599)
    at com.google.net.rpc3.impl.wire.RpcBaseTransport.receivedMessage(RpcBaseTransport.java:417)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport$ClientAdapter.receivedMessage(UdrpcTransport.java:424)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport.dispatchPacket(UdrpcTransport.java:265)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport.readPackets(UdrpcTransport.java:217)
    at com.google.apphosting.runtime.udrpc.UdrpcTransport$1.run(UdrpcTransport.java:81)
    at com.google.net.eventmanager.AbstractFutureTask$Sync.innerRun(AbstractFutureTask.java:260)
    at com.google.net.eventmanager.AbstractFutureTask.run(AbstractFutureTask.java:121)
    at com.google.net.eventmanager.EventManagerImpl.runTask(EventManagerImpl.java:578)
    at com.google.net.eventmanager.EventManagerImpl.internalRunWorkerLoop(EventManagerImpl.java:1002)
    at com.google.net.eventmanager.EventManagerImpl.runWorkerLoop(EventManagerImpl.java:884)
    at com.google.net.eventmanager.WorkerThreadInfo.runWorkerLoop(WorkerThreadInfo.java:136)
    at com.google.net.eventmanager.EventManagerImpl$WorkerThread.run(EventManagerImpl.java:1855)

我发现this但是我没有使用任何实体,我猜测正在创建实体来存储数据但是因为我不确定在哪里定义我应该写多少呢?

3 个答案:

答案 0 :(得分:4)

运行Map Reduce时遇到同样的异常。 问题是Map Reduce包含大量元数据,这些元数据是在作业期间生成的,之后数据被保存为数据存储区中的单个实体。 当实体变得大于允许值时,抛出异常并停止Map Reduce。

我的问题是我写了很多日志,比如

context.getCounter("some log message", "value");

在Map Reduce的正文中。 解决方案是减少日志大小和数量。

答案 1 :(得分:2)

一次调用中可以写入的最大实体数为500,目前实体的大小最多为1 MB。这在这里描述:

Quotas_and_Limits

因此,您尝试在实体中存储过多数据,因此抛出此异常。 以下是您可以尝试的一些解决方法:

  • 将您的数据拆分为多个小于1 MB块的实体(不知道具体用例的具体用法)
  • S3用于特别大的文件。

答案 2 :(得分:1)

我最近遇到了同样的问题,并将其追溯到我的Mapper类。该类的成员数据被序列化为/形成切片之间的数据存储(我认为),并且我有大量的数据,超过1MB。它与我班级正在进行的对数据存储区的任何显式调用无关。