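Our service can run SELECT and INSERT queries on both our local and our deployed Cassandra instances without any problems.
However, we are running into an issue with the following DELETE query: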
DELETE FROM config_by_uuid WHERE uuid = record_uuid;
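Our service successfully deletes a record on our local instance, but not on our deployed instance. Note that this behavior is consistent on both instances, and that no errors are reported on the deployed instance.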
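Notably, when the query above is run on our deployed instance through cqlsh, it successfully deletes a record. It only fails when run from our service against the deployed instance. Our service and cqlsh use the same user to run the query.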
At first we suspected a Cassandra consistency issue, so we ran the query in cqlsh at consistency levels ONE and QUORUM; it succeeded at both. Note that our service currently uses QUORUM for all operations.
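We have ruled this out as a code issue because the service works as expected against our local instance. Our reasoning is that if it were a code issue, it should fail on both instances, so the difference must lie in our Cassandra installations. Both instances use Cassandra 3.11.X.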
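The keyspace and table details are identical on both instances and are shown below (note that we currently use only a single node, as we are still in the early stages of development):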
CREATE KEYSPACE config WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
CREATE TABLE config.config_by_uuid (
    uuid uuid PRIMARY KEY,
    config_name text,
    config_value text,
    service_uuid uuid,
    tenant_uuid uuid,
    user_uuid uuid
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
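As a quick sanity check on the consistency-level theory: QUORUM requires a majority of replicas, which is floor(RF / 2) + 1. With the replication_factor of 1 shown in the keyspace above, that is a single replica, the same as ONE. The Go sketch below only illustrates this arithmetic; the function name quorum is our own and is not part of any Cassandra client library.

```go
package main

import "fmt"

// quorum returns how many replicas must acknowledge a request at
// consistency level QUORUM for the given replication factor.
func quorum(replicationFactor int) int {
	return replicationFactor/2 + 1
}

func main() {
	// replication_factor 1 comes from the keyspace definition above;
	// 3 and 5 are included only for comparison.
	for _, rf := range []int{1, 3, 5} {
		fmt.Printf("RF=%d: ONE needs 1 replica, QUORUM needs %d\n", rf, quorum(rf))
	}
}
```

On this single-node setup, then, ONE and QUORUM place the same requirement on the cluster, so the consistency level alone is unlikely to explain the difference between cqlsh and the service.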
We have enabled tracing on our deployed Cassandra. Below are the details when the query is run through cqlsh:
system_traces.sessions:
session_id: 25b48ce0-0491-11ea-ace9-5db0758d00f3
client: node_ip
command: QUERY
coordinator: node_ip
duration: 1875
parameters: {'consistency_level': 'ONE', 'page_size': '100', 'query': 'delete from config_by_uuid where uuid = 96ac4699-5199-4a80-9c59-b592d28ea2b7;', 'serial_consistency_level': 'SERIAL'}
request: Execute CQL3 query
started_at: 2019-11-11 14:40:03.758000+0000
system_traces.events:
session_id | event_id | activity | source | source_elapsed | thread
--------------------------------------+--------------------------------------+---------------------------------------------------------------------------------------+--------------+----------------+-----------------------------
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4b3f0-0491-11ea-ace9-5db0758d00f3 | Parsing delete from config_by_uuid where uuid = 96ac4699-5199-4a80-9c59-b592d28ea2b7; | node_ip | 203 | Native-Transport-Requests-1
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4b3f1-0491-11ea-ace9-5db0758d00f3 | Preparing statement | node_ip | 381 | Native-Transport-Requests-1
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4b3f2-0491-11ea-ace9-5db0758d00f3 | Executing single-partition query on roles | node_ip | 1044 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4b3f3-0491-11ea-ace9-5db0758d00f3 | Acquiring sstable references | node_ip | 1080 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db00-0491-11ea-ace9-5db0758d00f3 | Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node_ip | 1114 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db01-0491-11ea-ace9-5db0758d00f3 | Key cache hit for sstable 2 | node_ip | 1152 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db02-0491-11ea-ace9-5db0758d00f3 | Merged data from memtables and 1 sstables | node_ip | 1276 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db03-0491-11ea-ace9-5db0758d00f3 | Read 1 live rows and 0 tombstone cells | node_ip | 1307 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db04-0491-11ea-ace9-5db0758d00f3 | Executing single-partition query on roles | node_ip | 1466 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db05-0491-11ea-ace9-5db0758d00f3 | Acquiring sstable references | node_ip | 1484 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db06-0491-11ea-ace9-5db0758d00f3 | Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node_ip | 1501 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db07-0491-11ea-ace9-5db0758d00f3 | Key cache hit for sstable 2 | node_ip | 1525 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db08-0491-11ea-ace9-5db0758d00f3 | Merged data from memtables and 1 sstables | node_ip | 1573 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db09-0491-11ea-ace9-5db0758d00f3 | Read 1 live rows and 0 tombstone cells | node_ip | 1593 | ReadStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db0a-0491-11ea-ace9-5db0758d00f3 | Determining replicas for mutation | node_ip | 1743 | Native-Transport-Requests-1
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db0b-0491-11ea-ace9-5db0758d00f3 | Appending to commitlog | node_ip | 1796 | MutationStage-3
25b48ce0-0491-11ea-ace9-5db0758d00f3 | 25b4db0c-0491-11ea-ace9-5db0758d00f3 | Adding to config_by_uuid memtable | node_ip | 1827 | MutationStage-3
Below are the details when the query is run from our service:
system_traces.sessions:
session_id: 9ed67270-048f-11ea-ace9-5db0758d00f3
client: service_ip
command: QUERY
coordinator: node_ip
duration: 3247
parameters: {'bound_var_0_uuid': '19e12033-5ad4-4376-8293-315a26370d93', 'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 'DELETE FROM config.config_by_uuid WHERE uuid=? ', 'serial_consistency_level': 'SERIAL'}
request: Execute CQL3 prepared query
started_at: 2019-11-11 14:29:07.991000+0000
system_traces.events:
session_id | event_id | activity | source | source_elapsed | thread
--------------------------------------+--------------------------------------+---------------------------------------------------------------------------+--------------+----------------+-----------------------------
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed67271-048f-11ea-ace9-5db0758d00f3 | Executing single-partition query on roles | node_ip | 178 | ReadStage-2
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed67272-048f-11ea-ace9-5db0758d00f3 | Acquiring sstable references | node_ip | 204 | ReadStage-2
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed67273-048f-11ea-ace9-5db0758d00f3 | Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node_ip | 368 | ReadStage-2
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed69980-048f-11ea-ace9-5db0758d00f3 | Key cache hit for sstable 2 | node_ip | 553 | ReadStage-2
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed69981-048f-11ea-ace9-5db0758d00f3 | Merged data from memtables and 1 sstables | node_ip | 922 | ReadStage-2
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed69982-048f-11ea-ace9-5db0758d00f3 | Read 1 live rows and 0 tombstone cells | node_ip | 1193 | ReadStage-2
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6c090-048f-11ea-ace9-5db0758d00f3 | Executing single-partition query on roles | node_ip | 1587 | ReadStage-3
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6c091-048f-11ea-ace9-5db0758d00f3 | Acquiring sstable references | node_ip | 1642 | ReadStage-3
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6c092-048f-11ea-ace9-5db0758d00f3 | Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones | node_ip | 1708 | ReadStage-3
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6c093-048f-11ea-ace9-5db0758d00f3 | Key cache hit for sstable 2 | node_ip | 1750 | ReadStage-3
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6c094-048f-11ea-ace9-5db0758d00f3 | Merged data from memtables and 1 sstables | node_ip | 1845 | ReadStage-3
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6c095-048f-11ea-ace9-5db0758d00f3 | Read 1 live rows and 0 tombstone cells | node_ip | 1888 | ReadStage-3
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6e7a0-048f-11ea-ace9-5db0758d00f3 | Determining replicas for mutation | node_ip | 2660 | Native-Transport-Requests-1
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6e7a1-048f-11ea-ace9-5db0758d00f3 | Appending to commitlog | node_ip | 3028 | MutationStage-2
9ed67270-048f-11ea-ace9-5db0758d00f3 | 9ed6e7a2-048f-11ea-ace9-5db0758d00f3 | Adding to config_by_uuid memtable | node_ip | 3133 | MutationStage-2
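To make the two traces easier to compare, the sketch below diffs the parameters columns of the two system_traces.sessions rows. The parameter values are copied from the traces above; the helper diffKeys is a hypothetical name introduced here for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// diffKeys returns, in sorted order, every key whose value differs between
// the two maps or that is present in only one of them.
func diffKeys(a, b map[string]string) []string {
	seen := map[string]bool{}
	var out []string
	for k, va := range a {
		seen[k] = true
		if vb, ok := b[k]; !ok || va != vb {
			out = append(out, k)
		}
	}
	for k := range b {
		if !seen[k] {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

// Session parameters copied from the cqlsh trace.
var cqlshParams = map[string]string{
	"consistency_level":        "ONE",
	"page_size":                "100",
	"query":                    "delete from config_by_uuid where uuid = 96ac4699-5199-4a80-9c59-b592d28ea2b7;",
	"serial_consistency_level": "SERIAL",
}

// Session parameters copied from the service trace.
var serviceParams = map[string]string{
	"bound_var_0_uuid":         "19e12033-5ad4-4376-8293-315a26370d93",
	"consistency_level":        "QUORUM",
	"page_size":                "5000",
	"query":                    "DELETE FROM config.config_by_uuid WHERE uuid=? ",
	"serial_consistency_level": "SERIAL",
}

func main() {
	for _, k := range diffKeys(cqlshParams, serviceParams) {
		fmt.Printf("%-18s cqlsh=%q service=%q\n", k, cqlshParams[k], serviceParams[k])
	}
}
```

The visible differences are the statement form (a plain query in cqlsh versus a prepared query with a bound uuid in the service), the consistency level, and the page size. Both event lists end with the same three mutation steps, finishing with "Adding to config_by_uuid memtable".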
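Below are the steps we used to install our local Cassandra on Windows 10. Note that no configuration files were changed after installation:
Installed Java 8. Both java -version and javac -version work.
Installed Python 2. python --version works.
Downloaded the latest Cassandra bin.tar.gz file from: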
http://cassandra.apache.org/download/
Extracted the contents of the archive, renamed the resulting folder to cassandra, and placed it in C:\.
Added C:\cassandra\bin to our PATH environment variable.
Below are the steps we used to install our deployed Cassandra on CentOS 8:
Update yum:
yum -y update
Install Java:
yum -y install java
java -version
Create a repo file for yum to use:
nano /etc/yum.repos.d/cassandra.repo
---
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
Install Cassandra:
yum -y install cassandra
Create a service file for Cassandra:
nano /etc/systemd/system/cassandra.service
---
[Unit]
Description=Apache Cassandra
After=network.target
[Service]
PIDFile=/var/run/cassandra/cassandra.pid
User=cassandra
Group=cassandra
ExecStart=/usr/sbin/cassandra -f -p /var/run/cassandra/cassandra.pid
Restart=always
[Install]
WantedBy=multi-user.target
Reload the systemd daemon:
systemctl daemon-reload
Give Cassandra ownership of its directories:
sudo chown -R cassandra:cassandra /var/lib/cassandra
sudo chown -R cassandra:cassandra /var/log/cassandra
Configure the system to run Cassandra at boot:
systemctl enable cassandra
Configure the cassandra.yaml file:
nano /etc/cassandra/conf/cassandra.yaml
---
(TIP: Use Ctrl+W to search for the settings you want to change.)
authenticator: org.apache.cassandra.auth.PasswordAuthenticator
authorizer: org.apache.cassandra.auth.CassandraAuthorizer
role_manager: CassandraRoleManager
roles_validity_in_ms: 0
permissions_validity_in_ms: 0
cluster_name: 'MyCompany Dev'
initial_token: (should be commented-out)
listen_address: node_ip
rpc_address: node_ip
endpoint_snitch: GossipingPropertyFileSnitch
auto_bootstrap: false (add this at the bottom of the file)
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "node_ip"
Configure the cassandra-topology.properties file:
nano /etc/cassandra/conf/cassandra-topology.properties
---
(NOTE: For "Cassandra Node IP=Data Center:Rack", delete all existing values.)
#Cassandra Node IP=Data Center:Rack
[Local IP]=SG:Dev
# default for unknown nodes
default=SG:Dev
Configure the cassandra-rackdc.properties file:
nano /etc/cassandra/conf/cassandra-rackdc.properties
---
dc=SG
rack=Dev
Run the following commands to clean the directories:
rm -rf /var/lib/cassandra/data
rm -rf /var/lib/cassandra/commitlog
rm -rf /var/lib/cassandra/saved_caches
rm -rf /var/lib/cassandra/hints
Start Cassandra:
service cassandra start
Install Python 2:
yum -y install python2
python2 --version
Log in as the default user:
cqlsh -u cassandra -p cassandra node_ip --request-timeout=6000
Create a new user:
CREATE ROLE adminuser WITH PASSWORD = 'password' AND SUPERUSER = true AND LOGIN = true;
exit;
Log in as the new user:
cqlsh -u adminuser -p password node_ip --request-timeout=6000
Disable the default user:
ALTER ROLE cassandra WITH PASSWORD = 'cassandra' AND SUPERUSER = false AND LOGIN = false;
REVOKE ALL PERMISSIONS ON ALL KEYSPACES FROM cassandra;
GRANT ALL PERMISSIONS ON ALL KEYSPACES TO adminuser;
exit;
Our service is written in Golang and uses the following third-party libraries to communicate with Cassandra:
github.com/gocql/gocql
github.com/scylladb/gocqlx
github.com/scylladb/gocqlx/qb
UPDATE 1: Below are the permissions of the user that both our service and cqlsh use to run queries (obtained via list all permissions on config.config_by_uuid;):
role | username | resource | permission
----------+-----------+-------------------------------+------------
adminuser | adminuser | <all keyspaces> | CREATE
adminuser | adminuser | <all keyspaces> | ALTER
adminuser | adminuser | <all keyspaces> | DROP
adminuser | adminuser | <all keyspaces> | SELECT
adminuser | adminuser | <all keyspaces> | MODIFY
adminuser | adminuser | <all keyspaces> | AUTHORIZE
adminuser | adminuser | <keyspace config> | CREATE
adminuser | adminuser | <keyspace config> | ALTER
adminuser | adminuser | <keyspace config> | DROP
adminuser | adminuser | <keyspace config> | SELECT
adminuser | adminuser | <keyspace config> | MODIFY
adminuser | adminuser | <keyspace config> | AUTHORIZE
adminuser | adminuser | <table config.config_by_uuid> | ALTER
adminuser | adminuser | <table config.config_by_uuid> | DROP
adminuser | adminuser | <table config.config_by_uuid> | SELECT
adminuser | adminuser | <table config.config_by_uuid> | MODIFY
adminuser | adminuser | <table config.config_by_uuid> | AUTHORIZE
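The Cassandra documentation states that MODIFY grants the following permissions: INSERT, DELETE, UPDATE, TRUNCATE. Since adminuser can insert records without any problems, our delete issue does not appear to be a permissions issue.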
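The reasoning above can be written down as a small model. The Go sketch below is not how Cassandra evaluates permissions internally; it only encodes the MODIFY mapping quoted from the documentation and a subset of the grants from the table (the SELECT and MODIFY rows). The names statementsFor, grant, and allowed are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// statementsFor maps a permission to the CQL statements it covers.
// Only SELECT and MODIFY are modeled; other permissions are omitted.
var statementsFor = map[string][]string{
	"SELECT": {"SELECT"},
	"MODIFY": {"INSERT", "DELETE", "UPDATE", "TRUNCATE"},
}

// grant is one row of the permissions table for adminuser.
type grant struct {
	resource   string
	permission string
}

// grants holds the SELECT and MODIFY rows copied from the table.
var grants = []grant{
	{"<all keyspaces>", "SELECT"},
	{"<all keyspaces>", "MODIFY"},
	{"<keyspace config>", "SELECT"},
	{"<keyspace config>", "MODIFY"},
	{"<table config.config_by_uuid>", "SELECT"},
	{"<table config.config_by_uuid>", "MODIFY"},
}

// allowed reports whether any grant on one of the given resources
// covers the given statement type.
func allowed(gs []grant, resources []string, stmt string) bool {
	for _, g := range gs {
		for _, r := range resources {
			if g.resource != r {
				continue
			}
			for _, s := range statementsFor[g.permission] {
				if s == stmt {
					return true
				}
			}
		}
	}
	return false
}

func main() {
	// Resources that apply to the table, from broadest to narrowest.
	chain := []string{"<all keyspaces>", "<keyspace config>", "<table config.config_by_uuid>"}
	for _, stmt := range []string{"INSERT", "DELETE"} {
		fmt.Printf("%s allowed: %v\n", stmt, allowed(grants, chain, stmt))
	}
}
```

In this model, INSERT and DELETE are covered by the same MODIFY grants, which matches the conclusion that a user who can insert should also be able to delete.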
UPDATE 2: Below are the owners and permissions of the key Cassandra directories (via ls -al):
/etc/cassandra:
total 20
drwxr-xr-x 3 root root 4096 Nov 12 22:18 .
drwxr-xr-x. 103 root root 12288 Nov 12 22:18 ..
lrwxrwxrwx 1 root root 27 Nov 12 22:18 conf -> /etc/alternatives/cassandra
drwxr-xr-x 3 root root 4096 Nov 12 22:18 default.conf
/var/lib/cassandra:
total 24
drwxr-xr-x 6 cassandra cassandra 4096 Nov 12 22:38 .
drwxr-xr-x. 43 root root 4096 Nov 12 22:18 ..
drwxr-xr-x 2 cassandra cassandra 4096 Nov 12 22:38 commitlog
drwxr-xr-x 8 cassandra cassandra 4096 Nov 12 22:40 data
drwxr-xr-x 2 cassandra cassandra 4096 Nov 12 22:38 hints
drwxr-xr-x 2 cassandra cassandra 4096 Nov 12 22:38 saved_caches
/var/log/cassandra:
total 3788
drwxr-xr-x 2 cassandra cassandra 4096 Nov 12 22:19 .
drwxr-xr-x. 11 root root 4096 Nov 12 22:18 ..
-rw-r--r-- 1 cassandra cassandra 2661056 Nov 12 22:41 debug.log
-rw-r--r-- 1 cassandra cassandra 52623 Nov 12 23:11 gc.log.0.current
-rw-r--r-- 1 cassandra cassandra 1141764 Nov 12 22:40 system.log
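UPDATE 3: We also suspected a tombstone or compaction issue, so we tried setting gc_grace_seconds to 0 and running the delete query, but that did not help either.
Running nodetool compact -s config config_by_uuid with gc_grace_seconds set to both 0 and the default 864000 did not help either.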
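For context, gc_grace_seconds is the period a tombstone is kept before compaction is allowed to purge it, so 0 makes tombstones eligible for removal at the next compaction. The sketch below only converts the table's default of 864000 seconds into more readable units; gcGrace is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// gcGrace converts a gc_grace_seconds table option into a time.Duration.
func gcGrace(seconds int) time.Duration {
	return time.Duration(seconds) * time.Second
}

func main() {
	// 864000 is the gc_grace_seconds value from the table definition.
	d := gcGrace(864000)
	fmt.Printf("gc_grace_seconds=864000 is %v (%.0f days)\n", d, d.Hours()/24)
}
```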
UPDATE 4: We tried uninstalling and reinstalling Cassandra, but that did not resolve the issue. Below are the steps we used:
Uninstall Cassandra via yum:
yum -y remove cassandra
Remove the following directories:
rm -rf /var/lib/cassandra
rm -rf /var/log/cassandra
rm -rf /etc/cassandra
Remove any leftover files:
(Note: run rm -rf on the results of the commands below.)
find / -name 'cassandra'
find / -name '*cassandra*'
For example:
rm -rf /run/lock/subsys/cassandra
rm -rf /tmp/hsperfdata_cassandra
rm -rf /etc/rc.d/rc3.d/S80cassandra
rm -rf /etc/rc.d/rc2.d/S80cassandra
rm -rf /etc/rc.d/rc0.d/K20cassandra
rm -rf /etc/rc.d/rc6.d/K20cassandra
rm -rf /etc/rc.d/rc5.d/S80cassandra
rm -rf /etc/rc.d/rc4.d/S80cassandra
rm -rf /etc/rc.d/rc1.d/K20cassandra
rm -rf /root/.cassandra
rm -rf /var/cache/dnf/cassandra-e96532ac33a46b7e
rm -rf /var/cache/dnf/cassandra.solv
rm -rf /var/cache/dnf/cassandra-filenames.solvx
rm -rf /run/systemd/generator.late/graphical.target.wants/cassandra.service
rm -rf /run/systemd/generator.late/multi-user.target.wants/cassandra.service
rm -rf /run/systemd/generator.late/cassandra.service
UPDATE 5: This issue occurred on a Server install of CentOS, so we next tried a Minimal Install. Surprisingly, the issue did not occur on the minimal install. We are currently investigating what the differences might be.
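UPDATE 6: We tried creating one more server, this time also choosing a Server install of CentOS. Surprisingly, the issue did not occur on this server either, so the CentOS install type is also unrelated to our issue.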
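With this, we have confirmed that something went wrong with our Cassandra installation, although we still cannot pinpoint what we did wrong, given that even uninstalling and reinstalling did not fix the issue on the original server.
Perhaps our uninstall steps above were not thorough enough?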
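UPDATE 7: It turns out that the new servers did not have the issue because the original server used a customized CentOS ISO rather than the vanilla one. One of our team members is looking into what makes the customized ISO different, and I will update this question when they get back to us.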
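UPDATE 8: It turns out that the issue also exists in the supposedly vanilla CentOS ISO that we used, and since the customized ISO is based on it, all of our servers currently have the issue.
However, for the issue to occur, the server needs to be rebooted with the reboot command. Each reboot toggles whether the issue occurs (reboot 1, no issue; reboot 2, issue occurs; reboot 3, no issue).
One of our team members is currently investigating whether we used a bad CentOS ISO. We are also considering the possibility that the ISO is fine and that the problem lies in our virtual machine environment.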
UPDATE 9: We downloaded an uncustomized CentOS ISO, CentOS-8-x86_64-1905-dvd1.iso, from centos.org. We verified its checksum and confirmed that the ISO is identical to the one on the official CentOS website.
With this, we have isolated the issue to our virtual machine environment.
We are using vmware ESXi to create the virtual machines that host Cassandra.
Our virtual machine details are as follows:
OS details:
Compatibility: ESXi 6.7 virtual machine
Guest OS family: Linux
Guest OS version: CentOS 8 (64-bit)
Storage details:
Type: Standard (choices were `Standard` and `Persistent Memory`)
Datastore details:
Capacity: 886.75 GB
Free: 294.09 GB
Type: VMFS6
Thin provisioning: Supported
Access: Single
Virtual machine settings:
CPU: 1
(choices: 1-32)
Memory: 2048 MB
Hard disk 1: 16 GB
Maximum Size: 294.09 GB
Location: [datastore1] virtual_machine_name
Disk Provisioning: Thin Provisioned
(choices: Thin provisioned; Thick provisioned, lazily zeroed; Thick provisioned, eagerly zeroed)
Shares:
Type: Normal
(choices: Low, Normal, High, Custom)
Value: 1000
Limit - IOPs: Unlimited
Controller location: SCSI controller 0
(choices: IDE controller 0; IDE controller 1; SCSI controller 0; SATA controller 0)
Virtual Device Node unit: SCSI (0:0)
(choices: SCSI (0:0) to (0:64))
Disk mode: Dependent
(choices: Dependent; Independent - persistent; Independent - Non-persistent)
Sharing: None
(Disk sharing is only possible with eagerly zeroed, thick provisioned disks.)
SCSI Controller 0: VMware Paravirtual
(choices: LSI Logic SAS; LSI Logic Parallel; VMware Paravirtual)
SATA Controller 0: (no options)
USB controller 1: USB 2.0
(choices: USB 2.0; USB 3.0)
Network Adapter 1: our_domain
Connect: (checked)
CD/DVD Drive 1: Datastore ISO File (CentOS-8-x86_64-1905-dvd1.iso)
(choices: Host device; Datastore ISO File)
Connect: (checked)
Video Card: Default settings
(choices: Default settings; Specify custom settings)
Generated summary:
Many thanks to everyone for taking the time to read this question!
Answer 0 (score: 1)
This may be a permissions issue. Check the result of the following command:
cqlsh> list all permissions on config.config_by_uuid;
This blog from Datastax provides detailed information about authentication and authorization in Cassandra.