I'm trying to use SizeEstimator in a Scala project to find the size of my case class object, but it gives unexpected results.
import org.apache.spark.util.SizeEstimator
case class event(imei: String, date: String)
val check = event(imei, date)
println("size is event obj " + SizeEstimator.estimate(check))
println("size is single charct " + SizeEstimator.estimate("a"))
println("size is imei " + SizeEstimator.estimate(imei))
It prints:
size is event obj 520
size is single 48
size is imei 72
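Why are these numbers so far off? I expected the single character "a" to take 1 byte, and my imei, which is a 15-character string, to take 15 bytes. Any suggestions would be appreciated. Thanks.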
Answer 0 (score: 0)
scala> val char:java.lang.Character = 'a'
char: Character = a
scala> SizeEstimator.estimate(char)
res18: Long = 16
scala> SizeEstimator.estimate("A")
res19: Long = 48
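If you want the actual Java heap size, you have to declare the value explicitly with a Java type, as with java.lang.Character above. SizeEstimator.estimate takes an AnyRef, so a bare single-quoted Char literal will not compile: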
scala> SizeEstimator.estimate('A')
<console>:27: error: type mismatch;
found : Char('A')
required: AnyRef
Note: an implicit exists from scala.Char => java.lang.Character, but
methods inherited from Object are rendered ambiguous. This is to avoid
a blanket implicit which would convert any scala.Char to any AnyRef.
You may wish to use a type ascription: `x: java.lang.Character`.
SizeEstimator.estimate('A')
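The 16 and 48 reported above are consistent with per-object overhead (header, fields, padding) rather than raw character bytes. A rough sketch of that arithmetic follows. The layout constants are assumptions (64-bit HotSpot with compressed oops, 8-byte alignment, a String backed by a char[] as in Java 8) and are not read from the running JVM. The names LayoutSketch, align8, characterBytes and stringBytes are hypothetical and are not part of Spark.

```scala
object LayoutSketch {
  // Assumed layout constants (64-bit HotSpot, compressed oops); not queried from the JVM.
  val ObjectHeader = 12L // mark word (8) + compressed class pointer (4)
  val ArrayHeader  = 16L // object header (12) + array length (4)
  val RefSize      = 4L  // compressed reference

  // Round up to the next multiple of 8 (assumed object alignment).
  def align8(n: Long): Long = (n + 7) / 8 * 8

  // java.lang.Character: header plus one 2-byte char field.
  def characterBytes: Long = align8(ObjectHeader + 2)

  // String: header + reference to char[] + cached int hash, plus the char[] itself.
  def stringBytes(numChars: Int): Long =
    align8(ObjectHeader + RefSize + 4) + align8(ArrayHeader + 2L * numChars)

  def main(args: Array[String]): Unit = {
    println(characterBytes)  // 16
    println(stringBytes(1))  // 48
    println(stringBytes(15)) // 72
  }
}
```

Under these assumptions the sketch gives 16 for a Character, 48 for "A" and 72 for a 15-character string, which matches the values in the question and the REPL session. On a JVM with different settings (no compressed oops, or compact strings in Java 9+) the numbers would differ.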
In general, the following formula is used to estimate the size of a String:
Minimum String memory usage (bytes) = 8 * (int) ((((number of chars) * 2) + 45) / 8)
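As a quick check, the formula above can be written as a small Scala function. This is only a sketch: StringSizeSketch and minStringBytes are hypothetical names, and integer division stands in for the (int) cast.

```scala
object StringSizeSketch {
  // 8 * (int)(((numChars * 2) + 45) / 8), using truncating integer division.
  def minStringBytes(numChars: Int): Long =
    8L * ((numChars.toLong * 2 + 45) / 8)

  def main(args: Array[String]): Unit = {
    println(minStringBytes(1))  // 40
    println(minStringBytes(15)) // 72
  }
}
```

For a 15-character imei this gives 72, the same value SizeEstimator reported. For a single character it gives 40, which is a lower bound; the 48 reported for "a" is larger, presumably because of header sizes and alignment on the JVM in use.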