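I am trying to connect to my local Cassandra database using Apache Spark. It worked until I added a large number of rows to the table. Now it shows nothing at all. What is wrong? I believe the code is correct; does it need some change to the cassandra.yaml configuration?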
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.cassandra.CassandraSQLContext

// Shared Spark configuration pointing at the local Cassandra node.
trait Settings {
  val CassandraHost = "127.0.0.1"

  val conf = new SparkConf(true)
    .set("spark.cassandra.connection.host", CassandraHost)
    .set("spark.cassandra.auth.username", "cassandra")
    .set("spark.cassandra.auth.password", "cassandra")
    .set("spark.cleaner.ttl", "3600")
    .setMaster("local[12]")
    .setAppName(getClass.getSimpleName)

  // Created lazily, on first access.
  lazy val sc = new SparkContext(conf)
}

object Settings {
  def apply(): Settings = new Settings {}
}

object DataMiner extends App with Settings {
  val cc = new CassandraSQLContext(sc)
  cc.setKeyspace("cxp_logs")

  // Read up to 100 rows, then print them twice: once row by row, once as a table.
  val df = cc.cassandraSql("SELECT * FROM contact_task_status LIMIT 100")
  df.collect().foreach(println)
  df.show(1000)
}
Log:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/07/30 14:36:18 INFO SparkContext: Running Spark version 1.4.0
15/07/30 14:36:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/07/30 14:36:19 INFO SecurityManager: Changing view acls to: User2
15/07/30 14:36:19 INFO SecurityManager: Changing modify acls to: User2
15/07/30 14:36:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(User2); users with modify permissions: Set(User2)
15/07/30 14:36:19 INFO Slf4jLogger: Slf4jLogger started
15/07/30 14:36:19 INFO Remoting: Starting remoting
15/07/30 14:36:19 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.254.10.132:51434]
15/07/30 14:36:19 INFO Utils: Successfully started service 'sparkDriver' on port 51434.
15/07/30 14:36:19 INFO SparkEnv: Registering MapOutputTracker
15/07/30 14:36:19 INFO SparkEnv: Registering BlockManagerMaster
15/07/30 14:36:19 INFO DiskBlockManager: Created local directory at C:\Users\User2\AppData\Local\Temp\spark-9ee9ee13-7fcd-4e64-947a-83b9311b62f0\blockmgr-8c25eeeb-7a64-4162-961d-bb94b9dd7bf7
15/07/30 14:36:19 INFO MemoryStore: MemoryStore started with capacity 1927.8 MB
15/07/30 14:36:19 INFO HttpFileServer: HTTP File server directory is C:\Users\User2\AppData\Local\Temp\spark-9ee9ee13-7fcd-4e64-947a-83b9311b62f0\httpd-ce428dc2-5b70-45a3-b761-9d728616ff73
15/07/30 14:36:19 INFO HttpServer: Starting HTTP Server
15/07/30 14:36:19 INFO Utils: Successfully started service 'HTTP file server' on port 51435.
15/07/30 14:36:19 INFO SparkEnv: Registering OutputCommitCoordinator
15/07/30 14:36:20 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/07/30 14:36:20 INFO SparkUI: Started SparkUI at http://10.254.10.132:4040
15/07/30 14:36:20 INFO Executor: Starting executor ID driver on host localhost
15/07/30 14:36:20 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51472.
15/07/30 14:36:20 INFO NettyBlockTransferService: Server created on 51472
15/07/30 14:36:20 INFO BlockManagerMaster: Trying to register BlockManager
15/07/30 14:36:20 INFO BlockManagerMasterEndpoint: Registering block manager localhost:51472 with 1927.8 MB RAM, BlockManagerId(driver, localhost, 51472)
15/07/30 14:36:20 INFO BlockManagerMaster: Registered BlockManager
15/07/30 14:36:22 INFO Cluster: New Cassandra host /127.0.0.1:9042 added
15/07/30 14:36:22 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
15/07/30 14:36:22 INFO CassandraStrategies$CassandraTableScans: projectList: Vector(day_id#14L, minute_id#15, dlg_id#16, msisdn#17, service_type#18, service_count#19, channel#20, destination#21, milliseconds_id#22, row_ts#23, second_id#24, service_id#25, service_kind#26, service_result#27)
15/07/30 14:36:22 INFO CassandraStrategies$CassandraTableScans: predicates: List()
15/07/30 14:36:22 INFO CassandraStrategies$CassandraTableScans: pushdown predicates: ArrayBuffer()
15/07/30 14:36:22 INFO CassandraStrategies$CassandraTableScans: remaining predicates: ArrayBuffer()
15/07/30 14:36:22 INFO CassandraTableScan: attributes : day_id,minute_id,dlg_id,msisdn,service_type,service_count,channel,destination,milliseconds_id,row_ts,second_id,service_id,service_kind,service_result
15/07/30 14:36:22 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
15/07/30 14:36:22 INFO Cluster: New Cassandra host /127.0.0.1:9042 added
15/07/30 14:36:22 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
15/07/30 14:36:23 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster