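Cassandra lack of scalability

Asked: 2016-09-10 10:41:32

Tags: cassandra scalability

I have a scalability problem with a Cassandra database. Although I increased the number of nodes from 2 to 8, the performance of the database does not improve.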

Cassandra Version: 3.7
Cassandra Hardware x8: 1vCPU 2.5 GHz, 900 MB RAM, SSD DISK 20GB, 10 Gbps LAN
Benchmark Hardware x1: 16vCPU 2.5 GHz, 8 GB RAM, SSD DISK 5GB, 10 Gbps LAN

Settings changed from the defaults in cassandra.yaml:

cluster_name: 'tst'
seeds: "192.168.0.101,192.168.0.102,...108"
listen_address: 192.168.0.xxx
endpoint_snitch: GossipingPropertyFileSnitch
rpc_address: 192.168.0.xxx
concurrent_reads: 8
concurrent_writes: 8
concurrent_counter_writes: 8

KEYSPACE:

create keyspace tst WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '2' }; 

Example table:

CREATE TABLE shares (
    c1 int PRIMARY KEY,
    c2 varchar,
    c3 int,
    c4 int,
    c5 int,
    c6 varchar,
    c7 int
);

Example query used in the test:

INSERT INTO shares (c1, c2, c3, c4, c5, c6, c7) VALUES (%s, '%s', %s, %s, %s, '%s', %s)

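To connect to the database I use https://github.com/datastax/java-driver. As the driver's instructions recommend, I share one Cluster object and one Session object across multiple threads. Connection: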

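import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.policies.RoundRobinPolicy;

PoolingOptions poolingOptions = new PoolingOptions();
// core = 5, max = 300 connections per local host
poolingOptions.setConnectionsPerHost(HostDistance.LOCAL, 5, 300);
// raises the core size to 10, overriding the 5 set above
poolingOptions.setCoreConnectionsPerHost(HostDistance.LOCAL, 10);
poolingOptions.setPoolTimeoutMillis(5000);
QueryOptions queryOptions = new QueryOptions();
queryOptions.setConsistencyLevel(ConsistencyLevel.QUORUM);

Cluster.Builder builder = Cluster.builder();
builder.withPoolingOptions(poolingOptions);
builder.withQueryOptions(queryOptions);
builder.withLoadBalancingPolicy(new RoundRobinPolicy());
this.setPoints(builder); // here all of the nodes are added
Cluster cluster = builder.build();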

Query code:

public ResultSet execute(String query) {
    // Blocks the calling thread until the node responds
    ResultSet result = this.session.execute(query);
    return result;
}

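While the test was running, memory usage on all nodes was at 80% and CPU at 100%. What surprised me was the connection usage reported by the monitor (too low):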

[2016-09-10 09:39:51.537] /192.168.0.102:9042 connections=10, current load=62, max load=10240
[2016-09-10 09:39:51.556] /192.168.0.103:9042 connections=10, current load=106, max load=10240
[2016-09-10 09:39:51.556] /192.168.0.104:9042 connections=10, current load=104, max load=10240
[2016-09-10 09:39:51.556] /192.168.0.101:9042 connections=10, current load=196, max load=10240
[2016-09-10 09:39:56.467] /192.168.0.102:9042 connections=10, current load=109, max load=10240
[2016-09-10 09:39:56.467] /192.168.0.103:9042 connections=10, current load=107, max load=10240
[2016-09-10 09:39:56.467] /192.168.0.104:9042 connections=10, current load=115, max load=10240
[2016-09-10 09:39:56.468] /192.168.0.101:9042 connections=10, current load=169, max load=10240
[2016-09-10 09:40:01.468] /192.168.0.102:9042 connections=10, current load=113, max load=10240
[2016-09-10 09:40:01.468] /192.168.0.103:9042 connections=10, current load=84, max load=10240
[2016-09-10 09:40:01.468] /192.168.0.104:9042 connections=10, current load=92, max load=10240
[2016-09-10 09:40:01.469] /192.168.0.101:9042 connections=10, current load=205, max load=10240

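Monitor code: https://github.com/datastax/java-driver/tree/3.0/manual/pooling#monitoring-and-tuning-the-pool

I am trying to test the scalability of several NoSQL databases. With Redis the scalability was linear; here there is none at all, and I don't know why. Thanks for your help!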

1 answer:

Answer 0 (score: 3)

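1 GB of RAM per machine is very low. It is likely to cause heavy GC pressure. Check your logs for GC activity and try to work out whether the 100% CPU ceiling is always attributable to JVM GC.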
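To make the GC check concrete, here is a minimal sketch of measuring what share of a time window a JVM spends in garbage collection, using the standard `java.lang.management` beans. The class name `GcShare` and its methods are hypothetical. It measures the JVM it runs in; pointing it at a Cassandra node would mean reading the same GarbageCollector beans through a remote JMX connection, which is an assumption about your setup and is not shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch only: GcShare is a hypothetical helper, not part of Cassandra or the driver.
final class GcShare {

    /** Total milliseconds spent in GC so far, summed over all collectors of this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Fraction of a wall-clock window spent in GC, clamped to the range [0, 1]. */
    static double gcFraction(long gcMillisBefore, long gcMillisAfter, long windowMillis) {
        if (windowMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) (gcMillisAfter - gcMillisBefore) / windowMillis;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        Thread.sleep(1000); // sampling window; the real workload would run meanwhile
        long windowMillis = (System.nanoTime() - start) / 1_000_000;
        double share = gcFraction(gcBefore, totalGcMillis(), windowMillis);
        System.out.printf("GC share of the last window: %.1f%%%n", 100 * share);
    }
}
```

If the share stays high while the benchmark runs, the CPU is mostly going to collection rather than to serving requests, which would point at the heap size rather than at the cluster size.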

Another quirk: how many threads are you running on each machine? If you are trying to scale with this code (your code):

Query code:

public ResultSet execute(String query) {
    ResultSet result = this.session.execute(query);
    return result;
}
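then you won't get very far. Synchronous queries are hopelessly slow. Even if you try to use more threads, 1 GB of RAM is probably (I already know it is...) too low. You should write asynchronous queries, both for resource consumption and for scalability.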
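As a rough illustration of the asynchronous approach, the sketch below keeps a bounded number of requests in flight from a single thread instead of blocking on each one. It uses only the JDK so that it runs on its own: the `asyncExecute` function is a stand-in for the driver call (in driver 3.x, `Session.executeAsync` returns a `ResultSetFuture` that you would adapt or attach a callback to), and `BoundedAsyncWriter`, `executeAll`, and `maxInFlight` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Sketch only: BoundedAsyncWriter is a hypothetical helper, not part of the driver.
final class BoundedAsyncWriter {

    /**
     * Submits every query through asyncExecute while keeping at most
     * maxInFlight requests outstanding, then waits for all of them.
     * Results are returned in the same order as the queries.
     */
    static <R> List<R> executeAll(List<String> queries,
                                  Function<String, CompletableFuture<R>> asyncExecute,
                                  int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<R>> futures = new ArrayList<>(queries.size());
        for (String query : queries) {
            permits.acquireUninterruptibly(); // blocks only when the window is full
            CompletableFuture<R> future;
            try {
                future = asyncExecute.apply(query);
            } catch (RuntimeException e) {
                permits.release(); // submission failed, so give the permit back
                throw e;
            }
            // Free the slot as soon as this request completes, successfully or not.
            future.whenComplete((result, error) -> permits.release());
            futures.add(future);
        }
        List<R> results = new ArrayList<>(futures.size());
        for (CompletableFuture<R> future : futures) {
            results.add(future.join()); // failures surface as CompletionException
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in for the driver: "executes" a query on another thread.
        Function<String, CompletableFuture<Integer>> fakeExecute =
                query -> CompletableFuture.supplyAsync(query::length);
        List<Integer> lengths = executeAll(Arrays.asList("a", "bb", "ccc"), fakeExecute, 2);
        System.out.println(lengths); // prints [1, 2, 3]
    }
}
```

The semaphore is the important part: firing off requests without a limit just moves the pressure onto the driver's connection pool and the nodes, so the in-flight window is something to tune against the pool's reported load.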