I'm using Google Guava's Table to handle table-structured data in a Java application. My data object consists of the Table and a Map which stores the data type of each column (int, string, decimal, ...).
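import com.google.common.collect.Table;
import java.util.Map;

public class DataTable {
    private Table<Integer, String, Object> data;  // row index, column name, cell value
    private Map<String, Integer> types;           // column name -> data type code
    private static int maxObjectSize;
    private static int rowSize;
    private DiskCache dc;                         // my own class for writing to disk

    public DataTable() {
    }

    // Getter and Setter
}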
This object can become very large and memory-consuming (up to 10,000,000 rows and 16 GB of memory). So my idea was to cache the data to the temp folder every 50,000 rows or so and read it back when needed.
public void putRow(int row, String column, Object value) {
    data.put(row, column, value);
    rowSize = data.rowKeySet().size();
    if (rowSize == maxObjectSize) {
        writeCache();
    }
}
I have big problems caching the data. On the one hand, caching is very time-consuming; on the other, it is hard to ensure that no data is lost, and I haven't found a good third-party API for caching the data.
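Answer 0 (score: 0)
For the data, you can cache either single values or complete rows. To cache single values, construct a single compound key object from the row and the column.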
Cache<CompoundKey, Object> cache = ...;

Object getValue(int row, String column) {
    return cache.get(new CompoundKey(row, column));
}
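The answer names CompoundKey without defining it. As a minimal sketch, assuming a plain immutable class is acceptable, it could look like the following. The important part is that equals and hashCode are based on both the row and the column, because any key-based cache depends on that contract. The class body and the CompoundKeyDemo class are illustrative assumptions, and a HashMap stands in for the real cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type: the answer uses CompoundKey but does not show its definition.
final class CompoundKey {
    private final int row;
    private final String column;

    CompoundKey(int row, String column) {
        this.row = row;
        this.column = Objects.requireNonNull(column, "column");
    }

    // Two keys are equal when they address the same cell.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CompoundKey)) return false;
        CompoundKey other = (CompoundKey) o;
        return row == other.row && column.equals(other.column);
    }

    // Must be consistent with equals, otherwise lookups with a fresh key miss.
    @Override
    public int hashCode() {
        return 31 * row + column.hashCode();
    }

    @Override
    public String toString() {
        return "(" + row + ", " + column + ")";
    }
}

class CompoundKeyDemo {
    public static void main(String[] args) {
        // A HashMap stands in for the cache here (assumption for illustration only).
        Map<CompoundKey, Object> cache = new HashMap<>();
        cache.put(new CompoundKey(7, "price"), 19.99);

        // A newly constructed but equal key finds the stored value.
        System.out.println(cache.get(new CompoundKey(7, "price")));
        // A key for a different row does not.
        System.out.println(cache.get(new CompoundKey(8, "price")));
    }
}
```

If the key is to be written to disk by the cache, it would probably also need to implement java.io.Serializable, depending on the cache library.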
Alternatively, you can cache complete rows by putting a map into the cache.
Cache<Integer, Map<String, Object>> cache = ...;

Map<String, Object> getRow(int row) {
    return cache.get(row);
}
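You can use a cache like EHCache, which supports writing data to disk if it does not fit in the heap.
Which approach you should take depends on:
For a cache to be useful, you need to be able to (re)generate the missing data for a particular row, and you need an access pattern in which some rows, or some values within a row, are requested more often than others. If you can only generate the data as a whole, or you only access the complete data in a single scan, then a small database is a valid alternative. Look at something like MapDB or LevelDB.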
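To illustrate what "write to disk if it does not fit in the heap" means, here is a rough sketch using only the JDK. It is not EHCache's API and not a replacement for it: SpillingRowCache, its methods, and the one-file-per-row layout are assumptions made for this example. It keeps the most recently used rows in memory and serializes the least recently used row to a temp directory when the limit is exceeded.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class: an LRU row cache that spills evicted rows to disk.
final class SpillingRowCache {
    private final Path dir;
    private final LinkedHashMap<Integer, HashMap<String, Object>> hot;

    SpillingRowCache(final int maxRowsInMemory) {
        if (maxRowsInMemory < 1) throw new IllegalArgumentException("maxRowsInMemory must be >= 1");
        try {
            this.dir = Files.createTempDirectory("rowcache");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // accessOrder = true: the eldest entry is the least recently used one.
        this.hot = new LinkedHashMap<Integer, HashMap<String, Object>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, HashMap<String, Object>> eldest) {
                if (size() <= maxRowsInMemory) return false;
                spill(eldest.getKey(), eldest.getValue()); // persist before dropping from the heap
                return true;
            }
        };
    }

    // Stores a copy of the row; cell values must be Serializable to survive a spill.
    void putRow(int row, Map<String, Object> values) {
        hot.put(row, new HashMap<>(values));
    }

    // Returns the row from memory, or reloads it from disk; null if the row is unknown.
    Map<String, Object> getRow(int row) {
        HashMap<String, Object> values = hot.get(row);
        if (values == null && Files.exists(fileFor(row))) {
            values = load(row);
            hot.put(row, values); // may in turn spill another row
        }
        return values;
    }

    int rowsInMemory() {
        return hot.size();
    }

    private Path fileFor(int row) {
        return dir.resolve("row-" + row + ".bin");
    }

    private void spill(int row, HashMap<String, Object> values) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(fileFor(row)))) {
            out.writeObject(values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @SuppressWarnings("unchecked")
    private HashMap<String, Object> load(int row) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(fileFor(row)))) {
            return (HashMap<String, Object>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class SpillingRowCacheDemo {
    public static void main(String[] args) {
        SpillingRowCache cache = new SpillingRowCache(2);
        for (int row = 0; row < 5; row++) {
            Map<String, Object> values = new HashMap<>();
            values.put("id", row);
            values.put("name", "row" + row);
            cache.putRow(row, values);
        }
        System.out.println("rows in memory: " + cache.rowsInMemory());
        System.out.println("row 0 (reloaded from disk): " + cache.getRow(0));
    }
}
```

One file per row would be far too slow for millions of rows; a real disk tier batches writes and manages its own storage format, which is a good reason to use an existing library rather than this sketch.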