Cassandra data on one node growing rapidly

Posted: 2015-08-17 14:02:23

Tags: cassandra cql cassandra-2.0 cqlsh

I have migrated the records from the old column family to a new column family in testKeyspace.

CREATE KEYSPACE testkeyspace WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'DC1': '2',
  'DC2': '2'
};

Old column family structure:

CREATE TABLE old_Columnfamily (
  scopeid bigint,
  formid bigint,
  time timestamp,
  ipaddress text,
  record_link_id bigint,
  user_ifuid bigint,
  value text,
  PRIMARY KEY ((scopeid, formid), time)
) WITH CLUSTERING ORDER BY (time DESC);
CREATE INDEX update_audit_id_idx ON old_Columnfamily (record_link_id);

CREATE INDEX update_audit_user_ifuid_idx ON old_Columnfamily (user_ifuid);

New column family structure:

CREATE TABLE new_Columnfamily (
  scopeid bigint,
  formid bigint,
  time timeuuid,
  ipaddress inet,
  operation int,
  record_id bigint,
  value text,
  ifuid bigint,
  PRIMARY KEY ((scopeid), formid, time)
) WITH CLUSTERING ORDER BY (formid ASC, time DESC);

CREATE INDEX audit_operation_idx ON new_Columnfamily (operation);
CREATE INDEX audit_recordid_idx ON new_Columnfamily (record_id);
CREATE INDEX audit_zuid_idx ON new_Columnfamily (ifuid);

The output of nodetool status is:

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address              Load       Tokens  Owns   Host ID                               Rack
UN  172.xxx.xxx.x80   5.58 GB   256     15.5%  fda5181f-baf6-4c2d-9bf9-ecc4abc50c39  RAC1
UN  172.xxx.xxx.x29   6.63 GB   256     16.4%  12574f5e-538a-4386-8c34-c8603a7456be  RAC1
UN  172.xxx.xxx.x22  40.64 GB    256     17.2%  db390d80-161f-44fb-a9d8-536ea924533d  RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address              Load       Tokens  Owns   Host ID                               Rack
UN  172.xxx.xxx.x20   4.65 GB   256     17.9%  fda5181f-baf6-4c2d-9bf9-ecc4abc50c39  RAC1
UN  172.xxx.xxx.x67   6.37 GB   256     16.7%  12574f5e-538a-4386-8c34-c8603a7456be  RAC1
UN  172.xxx.xxx.x23  6.93 GB    256     16.2%  db390d80-161f-44fb-a9d8-536ea924533d  RAC1

Edit: output of nodetool -h 172.xxx.xxx.x23 tpstats

JVM arguments: -ea -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms8G -Xmx8G -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0       24426641         0                 0
RequestResponseStage              0         0       48496365         0                 0
MutationStage                     0         0       15623599         0                 0
ReadRepairStage                   0         0        2562071         0                 0
ReplicateOnWriteStage             0         0              0         0                 0
GossipStage                       0         0        3268659         0                 0
AntiEntropyStage                  0         0              0         0                 0
MigrationStage                    0         0             32         0                 0
MemoryMeter                       0         0            371         0                 0
MemtablePostFlusher               0         0          32263         0                 0
FlushWriter                       0         0          18447         0              1080
MiscStage                         0         0              0         0                 0
PendingRangeCalculator            0         0              8         0                 0
commitlog_archiver                0         0              0         0                 0
InternalResponseStage             0         0             12         0                 0
HintedHandoff                     2         2           1194         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
PAGED_RANGE                  0
BINARY                       0
READ                         0
MUTATION                     0
_TRACE                       0
REQUEST_RESPONSE             0

The problem: while I was migrating data from the old column family to the new one, node 172.xxx.xxx.x23 went down. I stopped the migration, brought the node back up, and then started the migration again. I have noticed that the data on node 172.xxx.xxx.x23 is growing very quickly.

Why is this happening? Please explain the cause. Thanks in advance.

1 Answer:

Answer 0 (score: 0):

I notice that the partition key differs between the old schema, ((scopeid, formid), time), and the new one, ((scopeid), formid, time). Since you reduced the partition key to a single column, the computed partition token changes which node stores each row. If you have the same scopeid across thousands of records, all of those records will land on the same node, because the partition key (and therefore the token) is identical for all of them.
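This effect can be illustrated with a small sketch. This is not Cassandra's actual Murmur3Partitioner or vnode token ring; the md5 hash, the three node names, and the modulo placement are assumptions made purely for illustration. It only shows the principle: with the old compound partition key (scopeid, formid) rows hash to many different tokens and spread across nodes, while with scopeid alone every row sharing one scopeid hashes to the same token and is pinned to a single node (times the replication factor).

```python
# Toy model (NOT Cassandra's real partitioner): rows are placed on a node
# by hashing only the partition key columns.
import hashlib

NODES = ["node_a", "node_b", "node_c"]  # hypothetical 3-node cluster

def token(partition_key_parts):
    """Hash the partition key columns to an integer 'token'."""
    raw = "|".join(str(p) for p in partition_key_parts).encode()
    return int(hashlib.md5(raw).hexdigest(), 16)

def owner(partition_key_parts):
    """Map a token to a node (simplified: modulo, no vnodes/replicas)."""
    return NODES[token(partition_key_parts) % len(NODES)]

# 1000 rows that all share scopeid=42 but have varying formids.
rows = [(42, formid) for formid in range(1000)]

# Old schema: partition key is (scopeid, formid) -> tokens differ per row.
old_placement = {owner((scopeid, formid)) for scopeid, formid in rows}

# New schema: partition key is (scopeid) alone -> every row gets one token.
new_placement = {owner((scopeid,)) for scopeid, _ in rows}

print("old schema spreads over nodes:", sorted(old_placement))
print("new schema concentrates on:", sorted(new_placement))
```

Under this toy model the old key fans the 1000 rows out over multiple nodes, while the new key puts all of them on exactly one node, which is consistent with one node's load growing much faster than the others during the migration.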