Good afternoon,
Over the past two days I have been getting a lot of connection problems with the Java server. It's a bit unusual because the error doesn't always occur; sometimes it just...
I'm using PySpark together with a Jupyter notebook. Everything runs on a VM instance in Google Cloud. This is my machine type in Google Cloud:
custom (8 vCPUs, 200 GB)
Here are the other settings:
conf = pyspark.SparkConf().setAppName("App")
conf = (conf.setMaster('local[*]')
.set('spark.executor.memory', '180G')
.set('spark.driver.memory', '180G')
.set('spark.driver.maxResultSize', '180G'))
sc = pyspark.SparkContext(conf=conf)
sq = pyspark.sql.SQLContext(sc)
I trained a random forest model and made predictions:
model = rf.fit(train)
predictions = model.transform(test)
After that I created the ROC curve and computed the AUC.
Then I wanted to look at the confusion matrix:
confusion_mat = metrics.confusionMatrix().toArray()
print(confusion_mat)
Now this error appears:
Traceback (most recent call last):
File "/usr/lib/python2.7/SocketServer.py", line 290, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/lib/python2.7/SocketServer.py", line 318, in process_request
self.finish_request(request, client_address)
File "/usr/lib/python2.7/SocketServer.py", line 331, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python2.7/SocketServer.py", line 652, in __init__
self.handle()
File "/usr/local/lib/python2.7/dist-packages/pyspark/accumulators.py", line 235, in handle
num_updates = read_int(self.rfile)
File "/usr/local/lib/python2.7/dist-packages/pyspark/serializers.py", line 577, in read_int
raise EOFError
EOFError
ERROR:root:Exception while sending command.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 883, in send_command
response = connection.send_command(command)
File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 1040, in send_command
"Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:39543)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/py4j/java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
File "/usr/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
Here is the console output:
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f4998300000, 603979776, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 603979776 bytes for committing reserved memory.
And the log file:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 603979776 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2643), pid=2377, tid=0x00007f1c94fac700
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 )
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
--------------- S Y S T E M ---------------
OS:DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.3 LTS"
uname:Linux 4.13.0-1008-gcp #11-Ubuntu SMP Thu Jan 25 11:08:44 UTC 2018 x86_64
libc:glibc 2.23 NPTL 2.23
rlimit: STACK 8192k, CORE 0k, NPROC 805983, NOFILE 1048576, AS infinity
load average:7.69 4.51 3.57
/proc/meminfo:
MemTotal: 206348252 kB
MemFree: 1298460 kB
MemAvailable: 250308 kB
Buffers: 6812 kB
Cached: 438232 kB
SwapCached: 0 kB
Active: 203906416 kB
Inactive: 339540 kB
Active(anon): 203804300 kB
Inactive(anon): 8392 kB
Active(file): 102116 kB
Inactive(file): 331148 kB
Unevictable: 3652 kB
Mlocked: 3652 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 4688 kB
Writeback: 0 kB
AnonPages: 203805168 kB
Mapped: 23076 kB
Shmem: 8776 kB
Slab: 114476 kB
SReclaimable: 50640 kB
SUnreclaim: 63836 kB
KernelStack: 4752 kB
PageTables: 404292 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 103174124 kB
Committed_AS: 205956256 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 71628 kB
DirectMap2M: 4122624 kB
DirectMap1G: 207618048 kB
CPU:total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 85 stepping 3, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx
Does anyone know what the problem might be and how I can fix it? I'm desperate. :(
// I think the Java Runtime Environment does not have enough memory to continue... but what can I do?
Thank you very much!
Answer 0 (score: 1)
If you use this in Google Cloud:
custom (8 vCPUs, 200 GB)
then you are clearly oversubscribing memory. Ignoring spark.executor.memory, which has no effect in local mode, spark.driver.memory accounts only for the JVM heap, not for memory consumed outside the JVM (for example by the PySpark worker processes).
Even within the JVM, only a part of the heap is available for data processing (see the Memory Management Overview), so setting spark.driver.maxResultSize equal to the total allocated memory makes no sense.
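To make the arithmetic concrete, here is a minimal sketch of a more conservative memory budget for a 200 GB local-mode machine. The 25% headroom and the maxResultSize ratio are illustrative assumptions, not official Spark recommendations; tune them for your workload:

```python
# Sketch: derive conservative Spark local-mode memory settings.
# The fractions below are illustrative assumptions.
total_gb = 200
os_and_python_gb = int(total_gb * 0.25)       # headroom for the OS and PySpark worker processes
driver_heap_gb = total_gb - os_and_python_gb  # the single JVM heap used in local mode
max_result_gb = driver_heap_gb // 4           # results must fit inside the heap with room to spare

settings = {
    'spark.driver.memory': '{}g'.format(driver_heap_gb),
    'spark.driver.maxResultSize': '{}g'.format(max_result_gb),
    # 'spark.executor.memory' is omitted on purpose: it has no effect in local mode
}
print(settings)
```

With roughly 50 GB left free, the JVM can actually commit its heap instead of failing in os::commit_memory as in the log above.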