HBase: How can I estimate the size of an HBase table?

Time: 2016-11-15 11:41:42

Tags: java hadoop hbase

I have multiple HBase tables. How can I estimate the approximate size of a table in Java?

1 answer:

Answer 0: (score: 3)

One way is to access HDFS with the Java client. All table data is usually stored under the /hbase folder.

Hadoop shell:

You can check the size with hadoop fs -du -h **path to hbase**/hbase

Under /hbase, each table occupies one folder...

hadoop fs -ls -R **path to hbase**/hbase

hadoop fs -du -h **path to hbase**/hbase/tablename
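
The per-folder aggregation that the du commands above report can be illustrated with a minimal sketch. Note the assumptions: this is a local-filesystem stand-in that uses java.nio.file (not the Hadoop API, and java.nio.file.Path rather than Hadoop's Path), and LocalDu, sizeOf and sizePerSubdirectory are hypothetical names introduced only for illustration. It treats every immediate subdirectory of a root folder as one "table" and sums the bytes beneath it. The real layout under the HBase root dir depends on the HBase version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local stand-in for "hadoop fs -du"; it does not talk to HDFS.
final class LocalDu {
    /** Total bytes of all regular files under dir, recursively. */
    static long sizeOf(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                    .mapToLong(p -> {
                        try {
                            return Files.size(p);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    })
                    .sum();
        }
    }

    /** Maps each immediate subdirectory name to its recursive size in bytes. */
    static Map<String, Long> sizePerSubdirectory(Path root) throws IOException {
        List<Path> dirs;
        try (Stream<Path> children = Files.list(root)) {
            dirs = children.filter(Files::isDirectory).collect(Collectors.toList());
        }
        Map<String, Long> sizes = new TreeMap<>();
        for (Path dir : dirs) {
            sizes.put(dir.getFileName().toString(), sizeOf(dir));
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no root folder is given.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        sizePerSubdirectory(root)
                .forEach((name, bytes) -> System.out.println(bytes + "\t" + name));
    }
}
```

Against a real cluster the same idea applies, but the sizes come from HDFS rather than from the local disk.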

Java HDFS client:

You can do the same thing with the Java HDFS client by passing each table path under the HBase root dir, as shown below. See the getSizeOfPaths and getSizeOfDirectory methods.

import java.io.IOException;

import org.apache.commons.io.FileUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUtil {
    /**
     * Estimates the number of splits by taking the size of the paths and dividing by the splitSize.
     *
     * @param paths         input paths to measure
     * @param configuration Hadoop configuration used to resolve the file system
     * @param splitSize     split size in bytes; must be positive
     * @return the number of splits, rounded up
     * @throws IOException if the file system cannot be accessed
     */
    public static long getNumOfSplitsForInputs(Path[] paths, Configuration configuration, long splitSize) throws IOException {
        long size = getSizeOfPaths(paths, configuration);
        // Integer ceiling division; Math.ceil(size / splitSize) would truncate before rounding.
        return (size + splitSize - 1) / splitSize;
    }

    /** Sums the sizes of all the given paths, in bytes. */
    public static long getSizeOfPaths(Path[] paths, Configuration configuration) throws IOException {
        long totalSize = 0L;
        for (Path path : paths) {
            totalSize += getSizeOfDirectory(path, configuration);
        }
        return totalSize;
    }

    // Here you can pass the HBase table folder that was described in the shell section.
    public static long getSizeOfDirectory(Path path, Configuration configuration) throws IOException {
        FileSystem fileSystem = FileSystem.get(configuration);
        // getLength() is the logical size in bytes of everything under path,
        // not multiplied by the replication factor.
        long size = fileSystem.getContentSummary(path).getLength();
        // FileUtils.byteCountToDisplaySize returns a human-readable version of a size given in bytes.
        System.out.println(FileUtils.byteCountToDisplaySize(size));
        return size;
    }
}
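
A split estimate like getNumOfSplitsForInputs needs a ceiling division of two long values. In Java, an expression such as Math.ceil(size / splitSize) with long operands divides first and discards the remainder, so the rounding never takes effect. The following is a minimal sketch of the integer-only alternative. SplitMath and ceilDiv are hypothetical names and the sizes are made-up figures, not measurements.

```java
// Hypothetical helper for illustration; not part of the original answer.
final class SplitMath {
    /** Ceiling of size / splitSize using only long arithmetic. */
    static long ceilDiv(long size, long splitSize) {
        if (size < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("size must be >= 0 and splitSize must be > 0");
        }
        // Adding (splitSize - 1) before dividing rounds any remainder up.
        return (size + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long size = 300L * 1024 * 1024;      // 300 MB of table data (made-up figure)
        long splitSize = 128L * 1024 * 1024; // 128 MB per split (made-up figure)

        // long / long drops the remainder before Math.ceil sees the value.
        System.out.println((long) Math.ceil(size / splitSize)); // prints 2
        System.out.println(ceilDiv(size, splitSize));           // prints 3
    }
}
```

The addition could overflow for sizes close to Long.MAX_VALUE, which is far beyond realistic table sizes. The form size / splitSize + (size % splitSize == 0 ? 0 : 1) avoids that if it matters.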