Cassandra nodes fail under heavy writes

Time: 2013-10-22 22:56:27

Tags: c# linq cassandra

I'm new to Cassandra and am testing it under write load, but I'm running into stability problems. First, some information about the environment:

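  • Windows (tested on PCs running Windows 7 and 8, and on Server 2008 R2 and Server 2012)
  • Java 7 u45 (the latest version at the time of writing)
  • Cassandra 1.2.10
  • Cassandra C# 1.01 driver used to access Cassandra
  • The problem occurs regardless of cluster size (tested with 1 to 6 nodes in the cluster).
  • The data disks are SSDs.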

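The application I'm writing will process very large data sets and needs the high write (and read) throughput that Cassandra is known for.

Example

The example I'm using creates a "test" object with four fields: ID (GUID), Name (text), Insert_User (text), and Insert_TimeStamp (timestamp). The code simply tries to create 1 million records in batches of 50,000. Typically, the write process fails after reaching 150,000 to 200,000 records. Most of the time a write timeout exception occurs (the timeout is set to 20 seconds). Occasionally the heap overflows, but I seem to have resolved that by adjusting the cache and flush settings in the cassandra.yaml file and by extending the heap with the following Java configuration settings: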

-Xmx20G -Xms1G -Xss256K

The code I'm using is here:

using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Diagnostics;
using System.Linq;
using System.Linq.Expressions;
using System.Text;
using System.Threading.Tasks;
using Cassandra;
using Cassandra.Data.Linq;

namespace TestLinq3
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WindowWidth = 160;

            Stopwatch sw = new Stopwatch();
            sw.Start();
            Console.WriteLine("Started at " + System.DateTime.Now);

            var cluster = Cluster.Builder()
                        .AddContactPoint("127.0.0.1")
                        .WithCredentials("cassandra", "cassandra")
                        .Build();

            Metadata metadata = cluster.Metadata;

            Console.WriteLine("Starting process...");

            var session = cluster.Connect();
            session.CreateKeyspaceIfNotExists("test");
            session.ChangeKeyspace("test");

            Console.WriteLine("Connected to keyspace...");

            var table = session.GetTable<Test>();
            table.CreateIfNotExists();

            List<Test> testlist = new List<Test>();

            for (int j = 0; j < 20; j++)
            {
                Console.WriteLine("Running j loop " + j.ToString());

                var batch = session.CreateBatch();

                for (int i = 0; i < 50000; i++)
                {
                    testlist.Add(new Test { id = System.Guid.NewGuid(), name = "Name " + i, insertUser = "cassandra", insertTimeStamp = System.DateTimeOffset.UtcNow });
                }

                batch.Append(from t in testlist select table.Insert(t));

                try
                {
                    batch.Execute();
                    //Flush();
                }
                catch (WriteTimeoutException ex)
                {
                    // Message matches the 60,000 ms sleep below.
                    Console.WriteLine("WriteTimeoutException hit.  Waiting 60 seconds...");
                    Console.WriteLine(ex.StackTrace);
                    System.Threading.Thread.Sleep(60000);
                }

                batch = null;

                Console.WriteLine("Time elapsed since start is " + sw.Elapsed.Hours.ToString("00")+":"+sw.Elapsed.Minutes.ToString("00")+":"+sw.Elapsed.Seconds.ToString("00"));
            }

            var results = (from rows in table where rows.name == "Name 333" select rows).Execute().Count();

            Console.WriteLine(results);

            sw.Stop();

            Console.WriteLine("Processing time was " + sw.Elapsed.Hours.ToString("00") + ":" + sw.Elapsed.Minutes.ToString("00") + ":" + sw.Elapsed.Seconds.ToString("00") + ":" + sw.Elapsed.Milliseconds.ToString("00") + ".");

            Console.ReadLine();
        }

        [AllowFiltering]
        [Table("test")]
        public class Test
        {
            [PartitionKey]
            [Column("id")]
            public Guid id;
            [SecondaryIndex]
            [Column("name")]
            public string name;
            [SecondaryIndex]
            [Column("insert_user")]
            public string insertUser;
            [SecondaryIndex]
            [Column("insert_timestamp")]
            public DateTimeOffset insertTimeStamp;
        }
    }
}

The cassandra.yaml settings I have (comments removed to save space):

#Cassandra storage config YAML
cluster_name: 'DEV'
initial_token:
max_hint_window_in_ms: 10800000 # 3 hours
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
permissions_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
disk_failure_policy: stop
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
row_cache_provider: SerializingCacheProvider
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
flush_largest_memtables_at: 0.50
reduce_cache_sizes_at: 0.50
reduce_cache_capacity_to: 0.30
concurrent_reads: 32
concurrent_writes: 32
memtable_total_space_in_mb: 4096
memtable_flush_writers: 8
memtable_flush_queue_size: 4
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: localhost
start_native_transport: true
native_transport_port: 9042
start_rpc: true
rpc_address: localhost
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
thrift_framed_transport_size_in_mb: 15
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 64
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
read_request_timeout_in_ms: 10000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: SimpleSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
server_encryption_options:
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
client_encryption_options:
    enabled: false
    keystore: conf/.keystore
    keystore_password: cassandra
internode_compression: all
inter_dc_tcp_nodelay: true

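The goal of this work is to keep the Cassandra nodes stable, even if that means a slower write rate per node (the current write rate is about 7,000 records per second). After searching StackOverflow and elsewhere on the Internet, I haven't found a proper fix for this problem, and I would greatly appreciate any feedback from people with experience running Cassandra in high-write-volume environments.

Best,

Tom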

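1 answer:

Answer 0: (score: 0)

What do the logs say? Do you see this error: Heap is 89.2615008651467% full. You may need to reduce memtable and/or cache sizes.

If so, the culprit may be CASSANDRA-6107. Prepared statements are cached, but the limit on how many queries can be cached is too high and does not take the size of each statement into account (a batch statement is much larger than a simple select), so the cache exhausts the node's memory. Also, because the statements are held in the cache, the GC cannot reclaim them, which leads to OOM errors.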
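
One thing worth checking alongside the point about batch statement size: in the question's code, testlist is declared outside the j loop and never cleared, so the batch built on iteration j holds 50,000 × (j + 1) inserts rather than 50,000. Whether or not that is the whole cause, keeping every batch small and bounded limits how large any single statement can get. The following is a minimal sketch of that idea, not a tested fix. BatchSketch and InChunks are hypothetical names, the chunk size of 100 is an arbitrary assumption, and the driver calls are left as a comment so the snippet runs without a cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchSketch
{
    // Hypothetical helper: yields consecutive chunks of at most chunkSize items.
    public static IEnumerable<List<T>> InChunks<T>(IEnumerable<T> items, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        var chunk = new List<T>(chunkSize);
        foreach (var item in items)
        {
            chunk.Add(item);
            if (chunk.Count == chunkSize)
            {
                yield return chunk;
                // Start a fresh list so earlier rows are never re-sent.
                chunk = new List<T>(chunkSize);
            }
        }
        // Emit the final, possibly shorter, chunk.
        if (chunk.Count > 0) yield return chunk;
    }

    public static void Main()
    {
        // Stand-in for the question's Test rows.
        var rows = Enumerable.Range(0, 1050).Select(i => "Name " + i);

        int batches = 0, total = 0;
        foreach (var chunk in InChunks(rows, 100))
        {
            // In the question's program, this is where a new batch would be
            // created with session.CreateBatch(), filled with table.Insert(t)
            // for each t in chunk, and executed.
            batches++;
            total += chunk.Count;
        }

        Console.WriteLine(batches + " batches, " + total + " rows"); // prints "11 batches, 1050 rows"
    }
}
```

In the original loop, the equivalent minimal change would be to create testlist inside the j loop (or clear it after each Execute) so each batch carries only the newly generated rows.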