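This has held my interest and attention since yesterday. I am trying to store bits in Java and I am running into memory overhead.
My first question on the subject was "What is size of my Bitset?"
Based on the answers there, I looked at other references and found the Memory Usage guide.
Then I looked at the BitSet source code,
which looks like this: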
public class BitSet implements Cloneable, java.io.Serializable {
    /*
     * BitSets are packed into arrays of "words."  Currently a word is
     * a long, which consists of 64 bits, requiring 6 address bits.
     * The choice of word size is determined purely by performance concerns.
     */
    private final static int ADDRESS_BITS_PER_WORD = 6;
    private final static int BITS_PER_WORD = 1 << ADDRESS_BITS_PER_WORD;
    private final static int BIT_INDEX_MASK = BITS_PER_WORD - 1;

    /* Used to shift left or right for a partial word mask */
    private static final long WORD_MASK = 0xffffffffffffffffL;

    /**
     * @serialField bits long[]
     *
     * The bits in this BitSet.  The ith bit is stored in bits[i/64] at
     * bit position i % 64 (where bit position 0 refers to the least
     * significant bit and 63 refers to the most significant bit).
     */
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("bits", long[].class),
    };

    /**
     * The internal field corresponding to the serialField "bits".
     */
    private long[] words;

    /**
     * The number of words in the logical size of this BitSet.
     */
    private transient int wordsInUse = 0;

    /**
     * Whether the size of "words" is user-specified.  If so, we assume
     * the user knows what he's doing and try harder to preserve it.
     */
    private transient boolean sizeIsSticky = false;

    /* use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = 7997698588986878753L;

    /**
     * Given a bit index, return word index containing it.
     */
    private static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD;
    }
    .....
}
Following the Memory Guide, this is what I calculated:
8 Bytes: housekeeping space
12 Bytes: 3 ints
8 Bytes: long
12 Bytes: long[]
4 Bytes: transient int // does it count?
1 Byte : transient boolean
3 Bytes: padding
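This sums to 45 + 3 bytes (padding to reach a multiple of 8),
which means that an empty BitSet by itself reserves 48 bytes.
But my requirement is only to store bits. What am I missing? What are my options?
Thanks a lot.
Update
My requirement is to store a total of 64 bits in two separate fields: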
class MyClass {
    BitSet timeStamp;
    BitSet id;
}
and I want to keep millions of MyClass objects in memory.
Answer 0 (score: 4)
"My requirement is to store a total of 64 bits in two separate fields"
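So just use a long (a 64-bit integer) and treat it as a bit field. I once needed something similar, but 32 bits were enough for me, so I wrote a small library class that uses an int as a bit set: https://github.com/claudemartin/smallset
Feel free to fork it; you only need to replace int with long, 32 with 64, 1 with 1L, and so on.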
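To illustrate the idea of using a long as a bit field, here is a minimal sketch. The class name LongBits and its methods are made up for this example and are not taken from the smallset library. Note the 1L literal: with a plain 1 the shift would be done in 32-bit int arithmetic.

```java
/** Hypothetical helper (not part of smallset): treats a plain long as a set of 64 bits. */
final class LongBits {
    private LongBits() {}

    /** Returns a copy of bits with bit n (0..63) switched on. */
    static long set(long bits, int n) {
        return bits | (1L << n);
    }

    /** Returns a copy of bits with bit n (0..63) switched off. */
    static long clear(long bits, int n) {
        return bits & ~(1L << n);
    }

    /** Tells whether bit n (0..63) is switched on. */
    static boolean get(long bits, int n) {
        return (bits & (1L << n)) != 0;
    }

    public static void main(String[] args) {
        long bits = 0L;
        bits = set(bits, 3);
        bits = set(bits, 63);
        System.out.println(get(bits, 3));        // true
        System.out.println(get(bits, 4));        // false
        bits = clear(bits, 3);
        System.out.println(Long.bitCount(bits)); // 1
    }
}
```

Because the value is a primitive, it can sit directly in a field or in a long[] with no per-value object header.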
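Answer 1 (score: 3)
"This sums to 45 + 3 bytes (padding to reach a multiple of 8), which means that an empty BitSet by itself reserves 48 bytes."
First, I would like to recommend the right tool for analyzing object layout in the JVM: JOL. In your case (java -jar jol-cli/target/jol-cli.jar internals java.util.BitSet
) JOL produces the following result: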
Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.
Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
java.util.BitSet object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) f4 df 9f e0 (11110100 11011111 10011111 11100000) (-526393356)
12 4 int BitSet.wordsInUse 0
16 1 boolean BitSet.sizeIsSticky false
17 3 (alignment/padding gap) N/A
20 4 long[] BitSet.words [0]
Instance size: 24 bytes (reported by Instrumentation API)
Space losses: 3 bytes internal + 0 bytes external = 3 bytes total
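Your calculation is incorrect because it counts the static fields, which are stored once per class and not in each instance, so an empty BitSet
object by itself takes 24 bytes. Note that this figure is still not 100% accurate, because the size of the long[]
object it refers to is not included. The complete result comes from java -jar jol-cli/target/jol-cli.jar externals java.util.BitSet
: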
Running 64-bit HotSpot VM.
Using compressed references with 3-bit shift.
Objects are 8 bytes aligned.
Field sizes by type: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
Array element sizes: 4, 1, 1, 2, 2, 4, 4, 8, 8 [bytes]
java.util.BitSet@6b25f76bd object externals:
ADDRESS SIZE TYPE PATH VALUE
7ae321a48 24 java.util.BitSet (object)
7ae321a60 24 [J .words [0]
This means that an empty BitSet uses 48 bytes in total, including the long array.
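The arithmetic behind those 48 bytes can be replayed as a rough sketch. The field sizes are the ones printed by JOL above (64-bit HotSpot with compressed references); the 16-byte array header is my assumption for that VM configuration, and the class and method names are invented for this example.

```java
/** Hypothetical calculator that replays the JOL numbers for an empty BitSet. */
final class BitSetFootprint {
    private BitSetFootprint() {}

    /** Rounds size up to the next multiple of alignment (alignment must be a power of two). */
    static long align(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    /** Shallow size of the BitSet object itself, using the layout JOL reported. */
    static long bitSetBytes() {
        long header = 12;      // object header
        long wordsInUse = 4;   // int
        long sizeIsSticky = 1; // boolean
        long gap = 3;          // alignment gap before the reference
        long wordsRef = 4;     // compressed reference to the long[]
        return align(header + wordsInUse + sizeIsSticky + gap + wordsRef, 8);
    }

    /** Size of a long[] with the given length; the 16-byte array header is an assumption. */
    static long arrayBytes(int length) {
        long arrayHeader = 16;
        return align(arrayHeader + (long) length * Long.BYTES, 8);
    }

    public static void main(String[] args) {
        long object = bitSetBytes();
        long array = arrayBytes(1); // an empty BitSet holds one 64-bit word
        System.out.println(object + " + " + array + " = " + (object + array));
        // prints "24 + 24 = 48"
    }
}
```

Other JVMs or flags (for example, running without compressed references) will give different numbers, so measure with JOL rather than relying on these constants.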
To reduce the memory footprint you can write your own BitSet
implementation. For your use case, for example, the following options are possible:
public class MyOwnBitSet {
    long word1;
    long word2;
}
public class MyOwnBitSet2 {
    long[] word = new long[2];
}
public class MyOwnBitSet3 {
    int index;
}
JOL produces the following results:
MyOwnBitSet@443b7951d object externals:
ADDRESS SIZE TYPE PATH VALUE
76ea4c7f8 32 MyOwnBitSet (object)
MyOwnBitSet2@69663380d object externals:
ADDRESS SIZE TYPE PATH VALUE
76ea53800 16 MyOwnBitSet2 (object)
76ea53810 32 [J .word [0, 0]
MyOwnBitSet3@5a2e4553d object externals:
ADDRESS SIZE TYPE PATH VALUE
76ea5c070 16 MyOwnBitSet3 (object)
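Let me explain the last example, MyOwnBitSet3.
To reduce the memory footprint, you can preallocate one large array of long
or int
values and store in each object only an index that points to the right cell. With a large enough number of objects this option is the most advantageous.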
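A minimal sketch of that idea, assuming one shared growable long[]: the names LongPool and PooledBits are hypothetical, and PooledBits plays the role of MyOwnBitSet3.

```java
import java.util.Arrays;

/** Hypothetical pool: every 64-bit value lives in one shared long[]. */
final class LongPool {
    private long[] cells;
    private int size;

    LongPool(int initialCapacity) {
        cells = new long[Math.max(1, initialCapacity)];
    }

    /** Stores value in the next free cell and returns its index (the "pointer"). */
    int allocate(long value) {
        if (size == cells.length) {
            cells = Arrays.copyOf(cells, cells.length * 2); // grow by doubling
        }
        cells[size] = value;
        return size++;
    }

    long get(int index) {
        checkIndex(index);
        return cells[index];
    }

    void set(int index, long value) {
        checkIndex(index);
        cells[index] = value;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}

/** Plays the role of MyOwnBitSet3: the object holds only an index into the pool. */
final class PooledBits {
    final int index;

    PooledBits(LongPool pool, long initialBits) {
        this.index = pool.allocate(initialBits);
    }

    boolean get(LongPool pool, int bit) {
        return (pool.get(index) & (1L << bit)) != 0;
    }

    void set(LongPool pool, int bit) {
        pool.set(index, pool.get(index) | (1L << bit));
    }
}
```

If nothing else needs to hang off the handle object, you could go one step further and pass around the bare int index, which removes the remaining 16 bytes per object.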
Answer 2 (score: 0)
To store a total of 64 bits in an object:
class MyClass {
    int timeStamp;
    int id;
}
Or, if you do not want the overhead of an object, you can use
long timeStampAndId;
The question is then how to encapsulate your operations. For primitives, Java does not give you much help, but what you can do is
enum TimeStampAndId {
    /* no instances */ ;
    public static boolean isTimeStampSet(long timeStampAndId, int n) { ... }
    public static boolean isIdSet(long timeStampAndId, int n) { ... }
}
i.e. use a utility class to support the primitive type.
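One possible way to fill in such a utility class is sketched below. The question does not say how the 64 bits are divided, so the 32/32 split (time stamp in the high half, id in the low half) and the pack, timeStamp, and id helpers are assumptions of mine.

```java
/** Sketch of a utility class for a packed long; the 32/32 split is an assumption. */
enum TimeStampAndId {
    /* no instances */ ;

    // Assumption: low 32 bits hold the id, high 32 bits hold the time stamp.
    private static final int ID_BITS = 32;

    /** Packs both fields into one long. */
    static long pack(int timeStamp, int id) {
        return ((long) timeStamp << ID_BITS) | (id & 0xFFFFFFFFL);
    }

    static int timeStamp(long timeStampAndId) {
        return (int) (timeStampAndId >>> ID_BITS);
    }

    static int id(long timeStampAndId) {
        return (int) timeStampAndId;
    }

    /** Tests bit n (0..31) of the time stamp part. */
    static boolean isTimeStampSet(long timeStampAndId, int n) {
        checkBit(n);
        return ((timeStampAndId >>> (ID_BITS + n)) & 1L) != 0;
    }

    /** Tests bit n (0..31) of the id part. */
    static boolean isIdSet(long timeStampAndId, int n) {
        checkBit(n);
        return ((timeStampAndId >>> n) & 1L) != 0;
    }

    private static void checkBit(int n) {
        if (n < 0 || n >= 32) {
            throw new IllegalArgumentException("bit index out of range: " + n);
        }
    }
}
```

With this approach, millions of records can be kept in a single long[] and the static methods are applied to each element, so no per-record object is allocated.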
In the future, Java is expected to support value types, which would not incur the object overhead.