我们在具有多个作业的独立火花群上耗尽内存。 在调查时我们发现了这些消息,并开始怀疑内存太少是免费的
16/09/23 12:30:38 INFO MemoryStore: Block broadcast_50802_piece0 stored as bytes in memory (estimated size 5.1 KB, free 233.5 KB)
16/09/23 12:30:38 INFO TorrentBroadcast: Reading broadcast variable 50802 took 9 ms
16/09/23 12:30:38 INFO MemoryStore: Block broadcast_50802 stored as values in memory (estimated size 11.3 KB, free 244.9 KB)
在另一个群集中,我们通常有免费报告为500MB +,并且堆栈溢出上的许多日志跟踪在GB中显示为免费。
在分析代码之后,这条消息似乎具有误导性。报告的可用内存实际上是blocksMemoryused
if (enoughMemory) {
// We acquired enough memory for the block, so go ahead and put it
val entry = new MemoryEntry(value(), size, deserialized)
entries.synchronized {
entries.put(blockId, entry)
}
val valuesOrBytes = if (deserialized) "values" else "bytes"
logInfo("Block %s stored as %s in memory (estimated size %s, free %s)".format(
blockId, valuesOrBytes, Utils.bytesToString(size), Utils.bytesToString(blocksMemoryUsed)))
} else {
// Tell the block manager that we couldn't put it in memory so that it can drop it to
// disk if the block allows disk storage.
lazy val data = if (deserialized) {
Left(value().asInstanceOf[Array[Any]])
} else {
Right(value().asInstanceOf[ByteBuffer].duplicate())
}
val droppedBlockStatus = blockManager.dropFromMemory(blockId, () => data)
droppedBlockStatus.foreach { status => droppedBlocks += ((blockId, status)) }
}
文档说明其使用的内存不是免费的
/**
* Amount of storage memory, in bytes, used for caching blocks.
* This does not include memory used for unrolling.
*/
private def blocksMemoryUsed: Long = memoryManager.synchronized {
memoryUsed - currentUnrollMemory
}
问题是为什么如果它实际使用的内存或我错误解释,这被称为免费。