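I am running into the same problem described in the question "JVM Monitor char array memory usage". I did not get a clear answer from that question, and I cannot add a comment there because my reputation is too low, so I am asking here.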
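I wrote a multithreaded program that calculates word co-occurrence frequencies. I read the words from a file lazily and do the calculations as I go. The program keeps a map that contains word pairs and their co-occurrence counts. When counting is finished, I write this map to a file.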
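After the frequency map is written out, the file is, for example, 3 GB. But while the program is running, it uses 35 GB of RAM plus 5 GB of swap. I then monitored the JVM and captured the following:

[Screenshots: memory view, garbage collector view, parameter overview]

How can char[] arrays take up this much memory when the output file is only 3 GB? Thanks.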
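As a rough back-of-the-envelope check on the gap between file size and heap size, here is a minimal sketch that compares an estimated in-memory size of one map entry with the size of the matching text line. Every constant in it is an assumption, not a measurement: a 64-bit HotSpot JVM without compressed references, `String` backed by `char[]` (JDK 8 or earlier), and an assumed 48 bytes of hash-table bookkeeping per entry. The names `FootprintEstimate`, `entryBytes`, and `lineBytes` are made up for illustration.

```scala
object FootprintEstimate {
  // ASSUMED sizes for a 64-bit HotSpot JVM without compressed references.
  val HeaderBytes: Long = 16L // object header
  val RefBytes: Long = 8L     // one object reference
  val CharBytes: Long = 2L    // a Java char is 16 bits

  // Objects are assumed to be padded to a multiple of 8 bytes.
  def align(n: Long): Long = (n + 7) / 8 * 8

  // char[]: header + 4-byte length field + 2 bytes per character.
  def charArrayBytes(chars: Int): Long =
    align(HeaderBytes + 4 + CharBytes * chars)

  // String: header + reference to its char[] + cached hash (Int), plus the array.
  def stringBytes(chars: Int): Long =
    align(HeaderBytes + RefBytes + 4) + charArrayBytes(chars)

  // One map entry: a tuple object holding two references, the two strings,
  // a boxed Int count, and an ASSUMED 48 bytes of hash-table bookkeeping.
  def entryBytes(len1: Int, len2: Int): Long =
    align(HeaderBytes + 2 * RefBytes) +
      stringBytes(len1) + stringBytes(len2) +
      align(HeaderBytes + 4) + 48L

  // Size of the text line "word1,word2,count\n", assuming one byte per character.
  def lineBytes(len1: Int, len2: Int, countDigits: Int): Long =
    (len1 + len2 + countDigits + 3).toLong

  // Estimated heap bytes per byte written to the output file.
  def ratio(len1: Int, len2: Int, countDigits: Int): Double =
    entryBytes(len1, len2).toDouble / lineBytes(len1, len2, countDigits)
}
```

For example, `FootprintEstimate.ratio(6, 6, 3)` gives the estimated blow-up for two six-letter words with a three-digit count. The result depends entirely on the assumed sizes and on the average word length, so it only indicates the order of magnitude.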
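The code below is not multithreaded. It merges two files that contain co-occurring words and their counts, and it shows the same memory usage problem. Because heap usage is so high, it also triggers a large number of GC runs, and the stop-the-world pauses keep the program itself from making progress: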
import java.io.{BufferedWriter, File, FileWriter, FilenameFilter}
import java.util.regex.Pattern

import core.WordTuple

import scala.collection.mutable.{Map => mMap}
import scala.io.{BufferedSource, Source}

class PairWordsMerger(path: String, regex: String) {
  private val wordsAndCounts: mMap[WordTuple, Int] = mMap[WordTuple, Int]()
  private val pattern: Pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
  private val dir: File = new File(path)
  private var sWordAndCount: Array[String] = Array.fill(3)("")
  private var tempTuple: WordTuple = WordTuple("","")
  private val matchedFiles: Array[File] = dir.listFiles(new FilenameFilter {
    override def accept(dir: File, name: String): Boolean = pattern.matcher(name).matches()
  })

  def merge(): Unit = {
    for(fileName <- matchedFiles) {
      val file: BufferedSource = Source.fromFile(fileName)
      val iter: Iterator[String] = file.getLines()
      while(iter.hasNext) {
        // split on "," because each entry in the file
        // is stored in this format: word1,word2,frequency
        sWordAndCount = iter.next().split(",")
        tempTuple = WordTuple(sWordAndCount(0), sWordAndCount(1))
        try {
          wordsAndCounts += (tempTuple -> (wordsAndCounts.getOrElse(tempTuple, 0) + sWordAndCount(2).toInt))
        } catch {
          case e: NumberFormatException => println("Cannot parse to int...")
        }
      }
      file.close()
      println("One pair words map update done")
    }
    writeToFile()
  }

  private def writeToFile(): Unit = {
    val f: File = new File("allPairWords.txt")
    val out = new BufferedWriter(new FileWriter(f))
    for(elem <- wordsAndCounts) {
      out.write(elem._1 + "," + elem._2 + "\n")
    }
    out.close()
  }
}

object PairWordsMerger {
  def apply(path: String, regex: String): PairWordsMerger = new PairWordsMerger(path, regex)
}
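
To make the merge step easier to reproduce at small scale without the real files, here is a minimal self-contained sketch of the same accumulate-by-key logic running on in-memory lines. `WordTuple` here is a hypothetical stand-in for `core.WordTuple`, which is not shown above, and `MergeSketch` and `mergeLines` are names made up for illustration. Unlike the code above, this sketch skips lines that do not have exactly three fields. It is not meant to change the memory behaviour.

```scala
import scala.collection.mutable

object MergeSketch {
  // Hypothetical stand-in for core.WordTuple (assumed to be a pair of words).
  final case class WordTuple(first: String, second: String)

  // Adds the counts from lines of the form "word1,word2,frequency" to `acc`.
  // Lines with the wrong number of fields or a non-numeric count are skipped.
  def mergeLines(lines: Iterator[String],
                 acc: mutable.Map[WordTuple, Int]): mutable.Map[WordTuple, Int] = {
    for (line <- lines) {
      line.split(",") match {
        case Array(w1, w2, count) =>
          try {
            val key = WordTuple(w1, w2)
            acc(key) = acc.getOrElse(key, 0) + count.trim.toInt
          } catch {
            case _: NumberFormatException => () // count is not an integer: skip
          }
        case _ => () // not exactly three fields: skip
      }
    }
    acc
  }
}
```

Calling `mergeLines` once per input, with the same accumulator each time, mirrors the loop over `matchedFiles`: every distinct word pair stays in the map until all inputs have been read.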