Question

我们有一个由3个cassandra节点组成的集群。所有节点都工作正常，但获取结果非常慢。我在cql-shell中运行SELECT - 查询以获取~100k行，在开始向我显示第一个结果之前，预热可能需要30秒。

为什么会这样？有没有办法加快速度？

以下是跟踪日志：

activity                                                                                                                                              | timestamp    | source         | source_elapsed
-------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+----------------+----------------
                                                                                                                                   execute_cql3_query | 17:41:34,923 | *.*.*.211 |              0
Parsing SELECT * FROM ga_page_visits WHERE site_global_key='GLOBAL_KEY' AND reported_at < '2014-05-17' AND reported_at > '2014-05-01' LIMIT 100000; | 17:41:34,923 | *.*.*.211 |             87
                                                                                                                                  Preparing statement | 17:41:34,923 | *.*.*.211 |            290
                                                                                                                   Sending message to /*.*.*.213 | 17:41:34,924 | *.*.*.211 |           1579
                                                                                                                   Sending message to /*.*.*.212 | 17:41:34,924 | *.*.*.211 |           1617
                                                                                                            Executing single-partition query on users | 17:41:34,924 | *.*.*.211 |           1666
                                                                                                                Message received from /*.*.*.211 | 17:41:34,925 | *.*.*.213 |             48
                                                                                                                Message received from /*.*.*.211 | 17:41:34,925 | *.*.*.212 |             40
                                                                                                                         Acquiring sstable references | 17:41:34,925 | *.*.*.211 |           1704
                                                                                                                          Merging memtable tombstones | 17:41:34,925 | *.*.*.211 |           1767
                                                                                                                          Key cache hit for sstable 2 | 17:41:34,925 | *.*.*.211 |           1886
                                                                                                          Seeking to partition beginning in data file | 17:41:34,925 | *.*.*.211 |           1908
                                                                            Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 17:41:34,925 | *.*.*.211 |           2337
                                                                                                           Merging data from memtables and 1 sstables | 17:41:34,925 | *.*.*.211 |           2366
                                                                                                                   Read 1 live and 0 tombstoned cells | 17:41:34,925 | *.*.*.211 |           2410
                                                                                                            Executing single-partition query on users | 17:41:34,926 | *.*.*.213 |            702
                                                                                                            Executing single-partition query on users | 17:41:34,926 | *.*.*.212 |            813
                                                                                                                         Acquiring sstable references | 17:41:34,926 | *.*.*.213 |            747
                                                                                                                         Acquiring sstable references | 17:41:34,926 | *.*.*.212 |            848
                                                                                                                          Merging memtable tombstones | 17:41:34,926 | *.*.*.213 |            838
                                                                                                                          Merging memtable tombstones | 17:41:34,926 | *.*.*.212 |            922
                                                                                                                          Key cache hit for sstable 5 | 17:41:34,926 | *.*.*.213 |           1006
                                                                                                                          Key cache hit for sstable 1 | 17:41:34,926 | *.*.*.212 |           1044
                                                                                                          Seeking to partition beginning in data file | 17:41:34,926 | *.*.*.213 |           1034
                                                                                                          Seeking to partition beginning in data file | 17:41:34,926 | *.*.*.212 |           1066
                                                                            Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 17:41:34,927 | *.*.*.213 |           1635
                                                                            Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | 17:41:34,927 | *.*.*.212 |           1543
                                                                                                           Merging data from memtables and 1 sstables | 17:41:34,927 | *.*.*.213 |           1676
                                                                                                           Merging data from memtables and 1 sstables | 17:41:34,927 | *.*.*.212 |           1571
                                                                                                                   Read 1 live and 0 tombstoned cells | 17:41:34,927 | *.*.*.213 |           1760
                                                                                                                   Read 1 live and 0 tombstoned cells | 17:41:34,927 | *.*.*.212 |           1634
                                                                                                                Enqueuing response to /*.*.*.211 | 17:41:34,927 | *.*.*.213 |           2014
                                                                                                                Enqueuing response to /*.*.*.211 | 17:41:34,927 | *.*.*.212 |           1846
                                                                                                                   Sending message to /*.*.*.211 | 17:41:34,927 | *.*.*.213 |           2173
                                                                                                                   Sending message to /*.*.*.211 | 17:41:34,927 | *.*.*.212 |           1960
                                                                                                                Message received from /*.*.*.213 | 17:41:34,928 | *.*.*.211 |           4759
                                                                                                                Message received from /*.*.*.212 | 17:41:34,928 | *.*.*.211 |           4775
                                                                                                             Processing response from /*.*.*.213 | 17:41:34,928 | *.*.*.211 |           5439
                                                                                                             Processing response from /*.*.*.212 | 17:41:34,928 | *.*.*.211 |           5668
                                                                                                                     Sending message to /*.*.*.40 | 17:41:34,929 | *.*.*.211 |           5954
                                                                                                   Executing single-partition query on ga_page_visits | 17:41:34,930 | *.*.*.211 |           6792
                                                                                                                         Acquiring sstable references | 17:41:34,930 | *.*.*.211 |           6868
                                                                                                                          Merging memtable tombstones | 17:41:34,930 | *.*.*.211 |           6936
                                                                                                                          Key cache hit for sstable 2 | 17:41:34,930 | *.*.*.211 |           7075
                                                                                                    Seeking to partition indexed section in data file | 17:41:34,930 | *.*.*.211 |           7102
                                                                                                                          Key cache hit for sstable 1 | 17:41:34,930 | *.*.*.211 |           7211
                                                                                                    Seeking to partition indexed section in data file | 17:41:34,930 | *.*.*.211 |           7232
                                                                            Skipped 1/3 non-slice-intersecting sstables, included 0 due to tombstones | 17:41:34,930 | *.*.*.211 |           7290
                                                                                                           Merging data from memtables and 2 sstables | 17:41:34,930 | *.*.*.211 |           7311
                                                                                                                Message received from /*.*.*.211 | 17:41:34,964 |   *.*.*.40 |             45
                                                                                                            Executing single-partition query on users | 17:41:34,965 |   *.*.*.40 |            658
                                                                                                                         Acquiring sstable references | 17:41:34,965 |   *.*.*.40 |            695
                                                                                                                          Merging memtable tombstones | 17:41:34,965 |   *.*.*.40 |            762
                                                                                                                         Key cache hit for sstable 14 | 17:41:34,965 |   *.*.*.40 |            852
                                                                                                          Seeking to partition beginning in data file | 17:41:34,965 |   *.*.*.40 |            876
                                                                                                                         Key cache hit for sstable 16 | 17:41:34,965 |   *.*.*.40 |           1275
                                                                                                          Seeking to partition beginning in data file | 17:41:34,965 |   *.*.*.40 |           1293
                                                                            Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 17:41:34,966 |   *.*.*.40 |           1504
                                                                                                           Merging data from memtables and 2 sstables | 17:41:34,966 |   *.*.*.40 |           1526
                                                                                                                   Read 1 live and 0 tombstoned cells | 17:41:34,966 |   *.*.*.40 |           1592
                                                                                                                Enqueuing response to /*.*.*.211 | 17:41:34,966 |   *.*.*.40 |           1755
                                                                                                                   Sending message to /*.*.*.211 | 17:41:34,966 |   *.*.*.40 |           1834
                                                                                                                  Message received from /*.*.*.40 | 17:41:34,980 | *.*.*.211 |          57679
                                                                                                               Processing response from /*.*.*.40 | 17:41:34,981 | *.*.*.211 |          57785
                                                                                                               Read 99880 live and 0 tombstoned cells | 17:41:36,315 | *.*.*.211 |        1392676
                                                                                                                                     Request complete | 17:41:37,541 | *.*.*.211 |        2618390

表架构是：

CREATE TABLE ga_page_visits (
  site_global_key ascii,
  reported_at timestamp,
  timeuuid ascii,
  bounces int,
  campaign text,
  channel ascii,
  conversions int,
  device ascii,
  keyword text,
  medium text,
  new_visits int,
  page_id int,
  page_views int,
  referral_path text,
  site_search_engine_id int,
  social_network text,
  source text,
  value decimal,
  visits int,
  PRIMARY KEY (site_global_key, reported_at, timeuuid)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

Answer 1

我怀疑您的数据模型存在问题。看起来你正在将绝对数量的数据推入单个分区。随着数据集变大，这将成为一个问题。我推荐这种类型的结构：

CREATE TABLE ga_page_visits (
  site_global_key ascii,
  day timestamp,
  timeuuid ascii,
  bounces int,
  campaign text,
  channel ascii,
  conversions int,
  device ascii,
  keyword text,
  medium text,
  new_visits int,
  page_id int,
  page_views int,
  referral_path text,
  site_search_engine_id int,
  social_network text,
  source text,
  value decimal,
  visits int,
  PRIMARY KEY ((site_global_key, day), timeuuid)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

您已经有一个timeuuid，其中嵌入了一个时间，因此您不需要reporting_at时间戳。

您将在此分区中拥有可笑的行数。您应该使用site_global_key＆amp;的复合主键。日期（不要加时间，让它为00:00:00。）

这样，每天都会生活在一个单独的分区上，并且更容易查询。每天做1次查询。这样做会好得多。

Answer 2

在我查看您的模型要求时，这看起来就像数据仓库类型设计。乔恩用汇总把它钉上了。如果您需要低延迟，高频率，则通常在具有高扩展要求的生产中创建数据的更紧密视图。我不想喷出链接，但有一些例子，如果你谷歌。

从Cassandra获取数据太慢了

2 个答案: