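I set up a 5-node cluster on Amazon EC2 that spans multiple regions (data centers). The OpsCenter node/instance is separate from the cluster nodes. When I try to add the existing cluster through the OpsCenter web UI, it shows "Error creating cluster: Timeout while adding cluster. Please check the log for details on the problem." I then checked opscenterd.log, and it seems OpsCenter can connect to the nodes, but there is one warning: "ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem."
Do you have any idea what is causing this? I am using DataStax Enterprise 4.0.2, Cassandra 2.0.6, and OpsCenter 4.1.2. The cluster runs on Ubuntu 12.04. I checked the Cassandra system log and the datastax-agent log, but found no errors.
Could this be a known bug, like the OpsCenter 4.1.0 issue "opscenterd crashed when updating definition files on platforms using Python 2.6", which was fixed in 4.1.1? http://www.datastax.com/documentation/opscenter/4.1/opsc/release_notes/opscReleaseNotes411.html
Please advise.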
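======================================================================
All ports are open in the EC2 security group (61620, 61621, etc.). I ran telnet from the OpsCenter host to the cluster nodes on port 61621, and from the cluster nodes to the OpsCenter host on port 61620; both connected. Below is opscenter.log: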
2014-05-14 05:53:46+0000 [] INFO: Starting factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x4076c68>
2014-05-14 05:53:46+0000 [] INFO: Adding new cluster 'Connect2me': {u'jmx': {u'username': u'', u'password': u'', u'port': u'7199'}, 'kerberos_client_principals': {}, 'kerberos': {}, u'agents': {}, 'kerberos_hostnames': {}, 'kerberos_services': {}, u'cassandra': {u'username': u'', u'seed_hosts': u'54.214.1.100', u'api_port': u'9160', u'password': u''}}
2014-05-14 05:53:46+0000 [] INFO: Starting new cluster services for Connect2me
2014-05-14 05:53:46+0000 [Connect2me] INFO: Starting services for cluster Connect2me
2014-05-14 05:53:46+0000 [Connect2me] INFO: Loading event plugins
2014-05-14 05:53:46+0000 [Connect2me] INFO: Loading event plugin conf /etc/opscenter/event-plugins/posturl.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Successfully loaded event plugin conf /etc/opscenter/event-plugins/posturl.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Loading event plugin conf /etc/opscenter/event-plugins/email.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Successfully loaded event plugin conf /etc/opscenter/event-plugins/email.conf
2014-05-14 05:53:46+0000 [Connect2me] INFO: Done loading event plugins
2014-05-14 05:53:46+0000 [] INFO: Metric caching enabled with 50 points and 1000 metrics cached
2014-05-14 05:53:46+0000 [] INFO: Starting PushService
2014-05-14 05:53:46+0000 [Connect2me] INFO: Starting CassandraCluster service
2014-05-14 05:53:46+0000 [Connect2me] INFO: agent_config items: {'cassandra_log_location': '/var/log/cassandra/system.log', 'thrift_port': 9160, 'thrift_ssl_truststore': None, 'rollups300_ttl': 2419200, 'rollups86400_ttl': -1, 'jmx_port': 7199, 'metrics_ignored_solr_cores': '', 'api_port': '61621', 'metrics_enabled': 1, 'thrift_ssl_truststore_type': 'JKS', 'kerberos_use_ticket_cache': True, 'use_ssl': 1, 'kerberos_renew_tgt': True, 'rollups60_ttl': 604800, 'cassandra_install_location': '', 'rollups7200_ttl': 31536000, 'kerberos_debug': False, 'storage_keyspace': 'OpsCenter', 'ec2_metadata_api_host': '169.254.169.254', 'provisioning': 0, 'kerberos_use_keytab': True, 'metrics_ignored_column_families': '', 'thrift_ssl_truststore_password': None, 'metrics_ignored_keyspaces': 'system, system_traces, system_auth, dse_auth, OpsCenter'}
2014-05-14 05:53:46+0000 [] INFO: Stopping factory <opscenterd.ThriftService.NoReconnectCassandraClientFactory instance at 0x4076c68>
2014-05-14 05:53:47+0000 [Connect2me] INFO: Enterprise functionality: True
2014-05-14 05:53:48+0000 [Connect2me] INFO: Snitch: com.datastax.bdp.snitch.DseDelegateSnitch
2014-05-14 05:53:48+0000 [Connect2me] INFO: Cluster Name: Connect2me
2014-05-14 05:53:48+0000 [Connect2me] INFO: Partitioner: org.apache.cassandra.dht.RandomPartitioner
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.214.1.100 ('128010234515697016761586673489854425713')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Node 54.214.1.100 has multiple tokens (vnodes). Only one picked for display.
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.214.1.110 ('74547314523494862953006764525852718268')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Node 54.214.1.110 has multiple tokens (vnodes). Only one picked for display.
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.243.203.229 ('95676355653121167189122638977297238333')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.214.1.78 ('165496574052081051366176941207447197429')
2014-05-14 05:53:50+0000 [Connect2me] INFO: Recognizing new node 54.243.201.237 ('164812453774030768973707069212224107713')
2014-05-14 05:53:56+0000 [Connect2me] INFO: Keyspaces: {'dse_security': CassandraKeyspace(name=dse_security, column_families=['tokens'], tables=[u'tokens'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'solr_admin': CassandraKeyspace(name=solr_admin, column_families=[], tables=[u'solr_resources'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'}), 'mykeyspace1': CassandraKeyspace(name=mykeyspace1, column_families=[], tables=[u'mysolr1', u'videos', u'lyrics', u'song'], attributes={'strategy_options': {'us-west-2': '3', 'us-east': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'system': CassandraKeyspace(name=system, column_families=['IndexInfo', 'NodeIdInfo', 'schema_keyspaces', 'hints'], tables=[u'peers', u'range_xfers', u'schema_keyspaces', u'schema_columns', u'IndexInfo', u'schema_triggers', u'sstable_activity', u'peer_events', u'paxos', u'batchlog', u'NodeIdInfo', u'compaction_history', u'compactions_in_progress', u'schema_columnfamilies', u'local', u'hints'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.LocalStrategy'}), 'cfs_archive': CassandraKeyspace(name=cfs_archive, column_families=['rules', 'sblocks', 'cleanup', 'inode'], tables=[u'rules', u'sblocks', u'cleanup', u'inode'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'OpsCenter': CassandraKeyspace(name=OpsCenter, column_families=['events_timeline', 'settings', 'rollups60', 'rollups86400', 'pdps', 'rollups7200', 'events', 'rollups300'], tables=[u'events_timeline', u'settings', u'rollups60', u'rollups86400', u'pdps', u'rollups7200', u'events', u'rollups300'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 
'org.apache.cassandra.locator.SimpleStrategy'}), 'system_traces': CassandraKeyspace(name=system_traces, column_families=[], tables=[u'events', u'sessions'], attributes={'strategy_options': {'replication_factor': '2'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'HiveMetaStore': CassandraKeyspace(name=HiveMetaStore, column_families=['MetaStore'], tables=[u'MetaStore'], attributes={'strategy_options': {'replication_factor': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.SimpleStrategy'}), 'cfs': CassandraKeyspace(name=cfs, column_families=['rules', 'sblocks', 'cleanup', 'inode'], tables=[u'rules', u'sblocks', u'cleanup', u'inode'], attributes={'strategy_options': {'us-east': '1'}, 'replica_placement_strategy': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}), 'dse_system': CassandraKeyspace(name=dse_system, column_families=[], tables=[u'job_trackers'], attributes={'strategy_options': {}, 'replica_placement_strategy': 'org.apache.cassandra.locator.EverywhereStrategy'})}
2014-05-14 05:54:06+0000 [] WARN: ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem.
2014-05-14 05:54:54+0000 [Connect2me] INFO: Initializing event storage.
2014-05-14 05:54:54+0000 [Connect2me] INFO: SSL agent communication is enabled. Automatic agent detection will be turned off.
2014-05-14 05:54:54+0000 [Connect2me] INFO: Attempting to load all persisted alert rules
2014-05-14 05:54:55+0000 [Connect2me] INFO: Done initializing event storage.
2014-05-14 05:54:55+0000 [Connect2me] INFO: Done loading persisted scheduled job descriptions
2014-05-14 05:54:55+0000 [Connect2me] INFO: Done loading persisted alert rules
2014-05-14 05:54:55+0000 [Connect2me] INFO: OpsCenter starting up.
Here is datastax-agent/agent.log:
INFO [qtp30763405-24] 2014-05-14 05:54:35,911 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-24] 2014-05-14 05:54:35,921 HTTP: :get /cluster/topology {:node_ip "54.214.1.100"} - 200
INFO [qtp30763405-21] 2014-05-14 05:54:35,934 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-21] 2014-05-14 05:54:35,945 HTTP: :get /cluster/topology {:node_ip "54.214.1.110"} - 200
INFO [qtp30763405-19] 2014-05-14 05:54:35,952 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-22] 2014-05-14 05:54:35,957 New JMX connection (127.0.0.1:7199)
INFO [qtp30763405-19] 2014-05-14 05:54:35,960 HTTP: :get /cluster/topology {:node_ip "54.243.203.229"} - 200
INFO [qtp30763405-22] 2014-05-14 05:54:35,972 HTTP: :get /cluster/topology {:node_ip "54.243.201.237"} - 200
I cannot see any errors in either log, and the nodes appear to be connecting, yet I still get the timeout error:
2014-05-14 05:54:06+0000 [] WARN: ProcessingError while calling CreateClusterConfController: Timeout while adding cluster. Please check the log for details on the problem.
Can anyone help me?
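To make the timing in the opscenterd log easier to read, here is a minimal Python sketch that measures the gap between log events. The helper names (`parse_ts`, `seconds_between`) are hypothetical, and the sample lines are copied from the log above, with the first one truncated after the cluster name.

```python
from datetime import datetime

def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS+0000' timestamp of an opscenterd log line."""
    return datetime.strptime(line[:24], "%Y-%m-%d %H:%M:%S%z")

def seconds_between(lines, start_marker, end_marker):
    """Seconds from the first line containing start_marker to the first containing end_marker."""
    start = next(parse_ts(l) for l in lines if start_marker in l)
    end = next(parse_ts(l) for l in lines if end_marker in l)
    return (end - start).total_seconds()

# Sample lines taken from the log in this question (first line truncated).
LOG = [
    "2014-05-14 05:53:46+0000 [] INFO: Adding new cluster 'Connect2me'",
    "2014-05-14 05:54:06+0000 [] WARN: ProcessingError while calling "
    "CreateClusterConfController: Timeout while adding cluster.",
    "2014-05-14 05:54:55+0000 [Connect2me] INFO: OpsCenter starting up.",
]

if __name__ == "__main__":
    print(seconds_between(LOG, "Adding new cluster", "Timeout while adding cluster"))
    print(seconds_between(LOG, "Adding new cluster", "OpsCenter starting up"))
```

In this log, the timeout warning is written 20 seconds after the add request, while the "OpsCenter starting up" line for the cluster only appears 69 seconds after it.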
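Answer 0 (score: 4)
Several things can cause this timeout; some are bugs in Cassandra, and some are things that could be optimized on the OpsCenter side. You can work around it by manually creating the cluster configuration file in /etc/opscenter/clusters/ and restarting opscenterd. For example, write the following to mycluster.conf: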
[cassandra]
seed_hosts = 1.2.3.4, 2.3.4.5
It may still take about 1 minute before the cluster works properly, but this bypasses the timeout check.
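If you prefer to generate that file rather than type it by hand, the following is a minimal sketch using Python's standard `configparser`. The function name `render_cluster_conf` is hypothetical, and the sketch only emits the `[cassandra]` section and `seed_hosts` key shown above; any other settings your cluster needs are not covered here.

```python
import configparser
import io

def render_cluster_conf(seed_hosts):
    """Return the text of a minimal cluster config with a [cassandra] seed_hosts entry."""
    parser = configparser.ConfigParser()
    parser["cassandra"] = {"seed_hosts": ", ".join(seed_hosts)}
    buf = io.StringIO()
    parser.write(buf)  # writes "[cassandra]" followed by "seed_hosts = ..."
    return buf.getvalue()

if __name__ == "__main__":
    # Example addresses from the answer above; replace them with your own seed nodes.
    print(render_cluster_conf(["1.2.3.4", "2.3.4.5"]), end="")
```

Redirect the output into a file under /etc/opscenter/clusters/ (for example mycluster.conf), then restart opscenterd as described above.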
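Answer 1 (score: 0)
Since OpsCenter is separate from the cluster nodes, the most likely problem is a firewall (security group) issue.
Check the ports listed at http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/sec/secConfFirePort.html and make sure you can telnet from opscenterd to the cluster nodes, and back, on the relevant ports.
The bug you mention would produce an error, and the stack trace would contain an ERROR entry about updating the definition files.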
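The telnet check can also be scripted. Below is a minimal Python sketch, assuming a plain TCP connect is a good enough stand-in for telnet; the function names `port_open` and `check_ports` are hypothetical. From the opscenterd host you would typically test the agent port (61621) and the Thrift port (9160) on each node, and from each node test port 61620 on the opscenterd host, matching the ports mentioned in the question.

```python
import socket
import sys

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def check_ports(hosts, ports, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}

if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python check_ports.py HOST PORT [PORT ...]
    host, ports = sys.argv[1], [int(p) for p in sys.argv[2:]]
    for (h, p), ok in check_ports([host], ports).items():
        print(f"{h}:{p} {'open' if ok else 'unreachable'}")
```

A successful connect only shows that the port is reachable through the security group; it does not prove that the service behind it is healthy.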