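I am implementing a recommendation engine in .NET C# and I am using Cassandra to store the data. I am still new to C*, having started using it only 2 months ago. At the moment my cluster has only 2 nodes (single DC), deployed on Azure DS2 VMs (7 GB RAM and 2 cores each). I use RF=2 and CL=1 for both reads and writes.
I have set the following timeouts in the yaml configuration file: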
read_request_timeout_in_ms: 60000
write_request_timeout_in_ms: 120000
counter_write_request_timeout_in_ms: 120000
request_timeout_in_ms: 120000
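On the client side I set a lower timeout for read queries (30 seconds each). The data stored in Cassandra is user history, item counters, and recommended item data. I created an API for my recommendation engine (located in an Equinix DC), and its job is very simple: every time a user opens a page of the website, it reads all the recommended item IDs from the recommended_items table in C*. That means the query for each user is very simple: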
select * from recommended_items where username = <username>
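When I load-tested it with up to 500 users/threads, it worked fine and was very fast. But when the live site calls the API to read from the C* table, I frequently get read timeouts, even though there are usually fewer than 20 users at the same time. I monitor Cassandra node activity with DataDog, and I found that only node #2 keeps timing out (the seed node is node #1, although my understanding is that seeds do not matter except during the bootstrap step). However, each time a timeout occurs, I try running the query with cqlsh on both nodes, and node #1 is the one that returns
OperationTimeOut Exception.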
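I have been trying to find the root cause of this problem. Does this have anything to do with the coordinator node being down (I read this article)? Or is it because I only have 2 nodes?
When a timeout occurs (the web page shows nothing) and I refresh the page that calls the API, it loads for a long time before showing anything (because of the timeout). Surprisingly, though, a few minutes later I get logs showing that all of those requests actually succeeded, even after the web page has been closed. It is as if the read requests keep running even though the page is closed.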
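To make the layering of the timeouts concrete, here is a small arithmetic sketch that uses only the values quoted above. The helper names (first_to_give_up, unobserved_window_ms) are hypothetical and my own; nothing here calls the driver or Cassandra. It also assumes that a read abandoned by the client may still be processed on the server side, which I have not verified.

```python
# Timeout values copied from the post; the helpers below are hypothetical
# illustrations and do not talk to Cassandra or to any driver.
SERVER_TIMEOUTS_MS = {
    "read_request_timeout_in_ms": 60000,
    "write_request_timeout_in_ms": 120000,
    "counter_write_request_timeout_in_ms": 120000,
    "request_timeout_in_ms": 120000,
}
CLIENT_READ_TIMEOUT_MS = 30000  # the 30-second client-side read timeout


def first_to_give_up(client_ms: int, server_ms: int) -> str:
    """Return which side stops waiting first for a single read."""
    if client_ms < server_ms:
        return "client"
    if client_ms > server_ms:
        return "server"
    return "both"


def unobserved_window_ms(client_ms: int, server_ms: int) -> int:
    """Time during which the server could still be working on a read
    that the client has already abandoned (assumption, see lead-in)."""
    return max(0, server_ms - client_ms)


server_read_ms = SERVER_TIMEOUTS_MS["read_request_timeout_in_ms"]
print(first_to_give_up(CLIENT_READ_TIMEOUT_MS, server_read_ms))
print(unobserved_window_ms(CLIENT_READ_TIMEOUT_MS, server_read_ms))
```

With these settings the client side gives up first, so an error seen in the API does not by itself show that the server-side read timeout was reached.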
The exceptions look like these (they do not occur together):
None of the hosts tried for query are available (tried: 13.73.193.140:9042,13.75.154.140:9042)
OR
Cassandra timeout during read query at consistency LocalOne (0 replica(s) responded over 1 required)
Does anyone have any suggestions for my problem? Thanks.
Output of cfstats .recommended_items
NODE#1
Read Count: 683
Read Latency: 2.970781844802343 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Flushes: 0
Table: recommendedvideos
Space used (live): 96034775
Space used (total): 96034775
Space used by snapshots (total): 40345163
Off heap memory used (total): 192269
SSTable Compression Ratio: 0.4405242717559795
Number of keys (estimate): 101493
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 376
Local read latency: 1.647 ms
Local write count: 0
Local write latency: NaN ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 126928
Bloom filter off heap memory used: 126896
Index summary off heap memory used: 40085
Compression metadata off heap memory used: 25288
Compacted partition minimum bytes: 43
Compacted partition maximum bytes: 454826
Compacted partition mean bytes: 2201
Average live cells per slice (last five minutes): 160.28657799274487
Maximum live cells per slice (last five minutes): 2759
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
Dropped Mutations: 0
NODE#2
Read Count: 733
Read Latency: 3.0032783083219647 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Flushes: 0
Table: recommendedvideos
Space used (live): 99145806
Space used (total): 99145806
Space used by snapshots (total): 15101127
Off heap memory used (total): 196008
SSTable Compression Ratio: 0.44063804831658704
Number of keys (estimate): 103863
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 453
Local read latency: 1.344 ms
Local write count: 0
Local write latency: NaN ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 129056
Bloom filter off heap memory used: 129040
Index summary off heap memory used: 40856
Compression metadata off heap memory used: 26112
Compacted partition minimum bytes: 43
Compacted partition maximum bytes: 454826
Compacted partition mean bytes: 2264
Average live cells per slice (last five minutes): 170.7715877437326
Maximum live cells per slice (last five minutes): 2759
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
Dropped Mutations: 0
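As a quick reading of the partition sizes in the output above, this sketch compares the largest compacted partition with the mean on each node. The figures are copied from the cfstats output; the partition_skew helper is a hypothetical name of my own, and the ratio is only a rough indicator, not a diagnosis.

```python
# Partition sizes in bytes, copied from the cfstats output above.
NODES = {
    "NODE#1": {"max_partition_bytes": 454826, "mean_partition_bytes": 2201},
    "NODE#2": {"max_partition_bytes": 454826, "mean_partition_bytes": 2264},
}


def partition_skew(max_bytes: int, mean_bytes: int) -> float:
    """How many times larger the biggest partition is than the mean one."""
    return max_bytes / mean_bytes


for name, stats in NODES.items():
    ratio = partition_skew(
        stats["max_partition_bytes"], stats["mean_partition_bytes"]
    )
    print(f"{name}: largest partition is about {ratio:.0f}x the mean size")
```

On both nodes the largest partition is roughly 200 times the mean, so some users have far more recommended items than the typical user, and select * for those users returns much more data.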