The next thing to note is that Cassandra is running on 10.0.0.60, not on localhost. I am not sure whether I have to tell pyspark about this explicitly (I sketch my guess after the transcript below). Using the same technique I learned on Stack Overflow, I installed the Cassandra pyspark connector as suggested here, and everything seemed fine:
[idf@node1 bin]$ spark-shell --packages TargetHolding:pyspark-cassandra:0.3.5
Ivy Default Cache set to: /home/idf/.ivy2/cache
The jars for the packages stored in: /home/idf/.ivy2/jars
:: loading settings :: url = jar:file:/opt/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
TargetHolding#pyspark-cassandra added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found TargetHolding#pyspark-cassandra;0.3.5 in spark-packages
found com.datastax.spark#spark-cassandra-connector-java_2.10;1.6.0-M1 in central
found com.datastax.spark#spark-cassandra-connector_2.10;1.6.0-M1 in central
found org.apache.cassandra#cassandra-clientutil;3.0.2 in central
found com.datastax.cassandra#cassandra-driver-core;3.0.0 in central
found io.netty#netty-handler;4.0.33.Final in central
found io.netty#netty-buffer;4.0.33.Final in central
found io.netty#netty-common;4.0.33.Final in central
found io.netty#netty-transport;4.0.33.Final in central
found io.netty#netty-codec;4.0.33.Final in central
found io.dropwizard.metrics#metrics-core;3.1.2 in list
found org.slf4j#slf4j-api;1.7.7 in central
found org.apache.commons#commons-lang3;3.3.2 in list
found com.google.guava#guava;16.0.1 in central
found org.joda#joda-convert;1.2 in central
found joda-time#joda-time;2.3 in central
found com.twitter#jsr166e;1.1.0 in central
found org.scala-lang#scala-reflect;2.10.5 in list
:: resolution report :: resolve 2048ms :: artifacts dl 15ms
:: modules in use:
TargetHolding#pyspark-cassandra;0.3.5 from spark-packages in [default]
com.datastax.cassandra#cassandra-driver-core;3.0.0 from central in [default]
com.datastax.spark#spark-cassandra-connector-java_2.10;1.6.0-M1 from central in [default]
com.datastax.spark#spark-cassandra-connector_2.10;1.6.0-M1 from central in [default]
com.google.guava#guava;16.0.1 from central in [default]
com.twitter#jsr166e;1.1.0 from central in [default]
io.dropwizard.metrics#metrics-core;3.1.2 from list in [default]
io.netty#netty-buffer;4.0.33.Final from central in [default]
io.netty#netty-codec;4.0.33.Final from central in [default]
io.netty#netty-common;4.0.33.Final from central in [default]
io.netty#netty-handler;4.0.33.Final from central in [default]
io.netty#netty-transport;4.0.33.Final from central in [default]
joda-time#joda-time;2.3 from central in [default]
org.apache.cassandra#cassandra-clientutil;3.0.2 from central in [default]
org.apache.commons#commons-lang3;3.3.2 from list in [default]
org.joda#joda-convert;1.2 from central in [default]
org.scala-lang#scala-reflect;2.10.5 from list in [default]
org.slf4j#slf4j-api;1.7.7 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 18 | 5 | 5 | 0 || 18 | 0 |
---------------------------------------------------------------------
:: problems summary ::
:::: ERRORS
unknown resolver null
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 18 already retrieved (0kB/13ms)
16/04/08 16:21:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.6.1
/_/
Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/04/08 16:22:06 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/spark-latest/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar."
16/04/08 16:22:06 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/spark-latest/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar."
16/04/08 16:22:06 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar."
16/04/08 16:22:23 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/04/08 16:22:23 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/04/08 16:22:27 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/spark-latest/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar."
16/04/08 16:22:27 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/spark-latest/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar."
16/04/08 16:22:27 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/spark-1.6.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar."
SQL context available as sqlContext.
scala>
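Since Cassandra is on 10.0.0.60 rather than localhost, I assume I will have to point the connector at it explicitly sooner or later. My understanding, which is an assumption on my part rather than something I have verified against the docs, is that the connector reads the target host from the spark.cassandra.connection.host property. A minimal Python sketch of setting it:
# Sketch only: assumes pyspark/spark-submit was launched with the connector
# package on the classpath, and that this property is the right knob.
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("cassandra-test")                          # hypothetical app name
        .set("spark.cassandra.connection.host", "10.0.0.60"))  # not localhost
sc = SparkContext(conf=conf)
The same property can presumably also be passed on the command line with --conf spark.cassandra.connection.host=10.0.0.60.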
But when I start pyspark and try to use it, I get an error. I am not sure what I am doing wrong in my setup:
[idf@node1 bin]$ pyspark
Python 2.7.11 |Anaconda 4.0.0 (64-bit)| (default, Dec 6 2015, 18:08:32)
Type "copyright", "credits" or "license" for more information.
IPython 4.1.2 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
16/04/08 16:24:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 1.6.1
/_/
Using Python version 2.7.11 (default, Dec 6 2015 18:08:32)
SparkContext available as sc, HiveContext available as sqlContext.
In [1]: import pyspark_cassandra
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-c1f2f694c450> in <module>()
----> 1 import pyspark_cassandra
ImportError: No module named pyspark_cassandra
In [2]:
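My suspicion, and it is only a suspicion, is that plain pyspark never saw the --packages flag I gave to spark-shell, so the pyspark_cassandra Python module was never put on the Python path. Presumably pyspark needs the same flag:
pyspark --packages TargetHolding:pyspark-cassandra:0.3.5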
EDIT 1
I added PYTHONPATH to my .bashrc, even though Continuum Analytics recommends against it:
Conda works best when these environment variables are not set, as their typical use-cases are obviated by Conda environments, and a common issue is that they will cause Python to pick up the wrong versions or broken versions of a library.
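For reference, what I added amounts to something like the following; the exact paths are guesses based on my layout, and the py4j version is the one Spark 1.6.1 ships:
# in ~/.bashrc -- paths are specific to my install, adjust as needed
export SPARK_HOME=/opt/spark-latest
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH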
I got closer when I added the packages argument as described here, but I am still unsure about the errors:
[idf@node1 ~]$ pyspark --packages com.datastax.spark:spark-cassandra-connector_2.10:1.4.0
Python 2.7.11 |Anaconda 4.0.0 (64-bit)| (default, Dec 6 2015, 18:08:32)
Type "copyright", "credits" or "license" for more information.
IPython 4.1.2 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
Ivy Default Cache set to: /home/idf/.ivy2/cache
The jars for the packages stored in: /home/idf/.ivy2/jars
:: loading settings :: url = jar:file:/opt/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.datastax.spark#spark-cassandra-connector_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found com.datastax.spark#spark-cassandra-connector_2.10;1.4.0 in central
found org.apache.cassandra#cassandra-clientutil;2.1.5 in central
found com.datastax.cassandra#cassandra-driver-core;2.1.5 in central
found io.netty#netty;3.9.0.Final in central
found com.codahale.metrics#metrics-core;3.0.2 in central
found org.slf4j#slf4j-api;1.7.5 in central
found org.apache.commons#commons-lang3;3.3.2 in list
found com.google.guava#guava;14.0.1 in list
found org.joda#joda-convert;1.2 in central
found joda-time#joda-time;2.3 in central
found com.twitter#jsr166e;1.1.0 in central
found org.scala-lang#scala-reflect;2.10.5 in list
downloading https://repo1.maven.org/maven2/com/datastax/spark/spark-cassandra-connector_2.10/1.4.0/spark-cassandra-connector_2.10-1.4.0.jar ...
[SUCCESSFUL ] com.datastax.spark#spark-cassandra-connector_2.10;1.4.0!spark-cassandra-connector_2.10.jar (1926ms)
downloading https://repo1.maven.org/maven2/org/apache/cassandra/cassandra-clientutil/2.1.5/cassandra-clientutil-2.1.5.jar ...
[SUCCESSFUL ] org.apache.cassandra#cassandra-clientutil;2.1.5!cassandra-clientutil.jar (78ms)
downloading https://repo1.maven.org/maven2/com/datastax/cassandra/cassandra-driver-core/2.1.5/cassandra-driver-core-2.1.5.jar ...
[SUCCESSFUL ] com.datastax.cassandra#cassandra-driver-core;2.1.5!cassandra-driver-core.jar(bundle) (633ms)
downloading https://repo1.maven.org/maven2/io/netty/netty/3.9.0.Final/netty-3.9.0.Final.jar ...
[SUCCESSFUL ] io.netty#netty;3.9.0.Final!netty.jar(bundle) (1066ms)
downloading https://repo1.maven.org/maven2/com/codahale/metrics/metrics-core/3.0.2/metrics-core-3.0.2.jar ...
[SUCCESSFUL ] com.codahale.metrics#metrics-core;3.0.2!metrics-core.jar(bundle) (94ms)
downloading https://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.5/slf4j-api-1.7.5.jar ...
[SUCCESSFUL ] org.slf4j#slf4j-api;1.7.5!slf4j-api.jar (33ms)
:: resolution report :: resolve 2920ms :: artifacts dl 3881ms
:: modules in use:
com.codahale.metrics#metrics-core;3.0.2 from central in [default]
com.datastax.cassandra#cassandra-driver-core;2.1.5 from central in [default]
com.datastax.spark#spark-cassandra-connector_2.10;1.4.0 from central in [default]
com.google.guava#guava;14.0.1 from list in [default]
com.twitter#jsr166e;1.1.0 from central in [default]
io.netty#netty;3.9.0.Final from central in [default]
joda-time#joda-time;2.3 from central in [default]
org.apache.cassandra#cassandra-clientutil;2.1.5 from central in [default]
org.apache.commons#commons-lang3;3.3.2 from list in [default]
org.joda#joda-convert;1.2 from central in [default]
org.scala-lang#scala-reflect;2.10.5 from list in [default]
org.slf4j#slf4j-api;1.7.5 from central in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 12 | 6 | 6 | 0 || 12 | 6 |
---------------------------------------------------------------------
:: problems summary ::
:::: ERRORS
unknown resolver null
unknown resolver null
unknown resolver sbt-chain
unknown resolver null
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
7 artifacts copied, 5 already retrieved (6423kB/71ms)
16/04/08 20:56:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 1.6.1
/_/
Using Python version 2.7.11 (default, Dec 6 2015 18:08:32)
SparkContext available as sc, HiveContext available as sqlContext.
In [1]: sqlContext.read\
.format("org.apache.spark.sql.cassandra")\
.options(table="timeseries", keyspace="tickdata")\
.load().show()
java.sql.SQLException: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@543c5d8d, see the next exception for details.
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
**A HUGE amount more stack trace, then**
You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<ipython-input-2-b5f777c1abc2> in <module>()
----> 1 sqlContext.read .format("org.apache.spark.sql.cassandra") .options(table="timeseries", keyspace="tickdata") .load().show()
/opt/spark-latest/python/pyspark/sql/context.pyc in read(self)
658 :return: :class:`DataFrameReader`
659 """
--> 660 return DataFrameReader(self)
661
662
and lots more of the error call stack...
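I believe, though I am not certain, that the metastore_db failure is unrelated to Cassandra: Derby allows only one connection, so this error tends to appear when another Spark shell is still open against the same working directory. The Cassandra DataFrame source itself should not need Hive at all, so a plain SQLContext ought to sidestep the metastore entirely. A sketch under those assumptions:
# Hypothetical standalone script; submit with something like:
#   spark-submit --packages com.datastax.spark:spark-cassandra-connector_2.10:1.4.0 read_timeseries.py
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().set("spark.cassandra.connection.host", "10.0.0.60")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)  # plain SQLContext: no Hive, so no Derby metastore_db

df = (sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(table="timeseries", keyspace="tickdata")
      .load())
df.show()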
EDIT 2
I am almost there!
I started with
[idf@node1 ~]$ sudo spark-shell --packages TargetHolding:pyspark-cassandra:0.3.5
and then, in Scala, followed the steps described here:
scala> sc.stop
scala> import com.datastax.spark.connector._, org.apache.spark.SparkContext, org.apache.spark.SparkContext._, org.apache.spark.SparkConf
import com.datastax.spark.connector._
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
scala> val conf = new SparkConf(true).set("spark.cassandra.connection.host", "10.0.0.60")
conf: org.apache.spark.SparkConf = org.apache.spark.SparkConf@2e95d163
scala> val sc = new SparkContext(conf)
sc: org.apache.spark.SparkContext = org.apache.spark.SparkContext@6d8d78a
scala> val test_spark_rdd = sc.cassandraTable("tickdate", "timeseries")
test_spark_rdd: com.datastax.spark.connector.rdd.CassandraTableScanRDD[com.datastax.spark.connector.CassandraRow] = CassandraTableScanRDD[0] at RDD at CassandraRDD.scala:15
scala> case class Quote(firm : Int, symbol : Int, curve : Int, quote_id : Int, time : String, bid : Double, ask : Double)
defined class Quote
scala> sc.cassandraTable[Quote]("sparkdemo", "quotes")
res2: com.datastax.spark.connector.rdd.CassandraTableScanRDD[Quote] = CassandraTableScanRDD[2] at RDD at CassandraRDD.scala:15
scala>
This looks good (I think, LOL). Next I have to verify that I am actually connected to Cassandra and that I can pull back some rows.
Ultimately I want to get this working from pyspark, but I suspect the approach will be much the same.
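A sketch of what I expect the pyspark equivalent to look like, based on my reading of the pyspark-cassandra README; I have not run this yet, so treat the names as assumptions:
# Launch with: pyspark --packages TargetHolding:pyspark-cassandra:0.3.5
# (inside the pyspark shell sc already exists; this is the standalone form)
from pyspark import SparkConf
from pyspark_cassandra import CassandraSparkContext

conf = SparkConf().set("spark.cassandra.connection.host", "10.0.0.60")
sc = CassandraSparkContext(conf=conf)

rdd = sc.cassandraTable("tickdata", "timeseries")  # keyspace, then table
print(rdd.count())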
EDIT 3
Then I tried:
scala> test_spark_rdd.count
16/04/09 01:37:22 ERROR DataSizeEstimates: Failed to fetch size estimates for tickdata.timeseries from system.size_estimates table. The number of created Spark partitions may be inaccurate. Please make sure you use Cassandra 2.1.5 or newer.
com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily size_estimates
at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:50)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:63)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:33)
at com.sun.proxy.$Proxy22.execute(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:33)
at com.sun.proxy.$Proxy22.execute(Unknown Source)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates$$anonfun$tokenRanges$1.apply(DataSizeEstimates.scala:40)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates$$anonfun$tokenRanges$1.apply(DataSizeEstimates.scala:38)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:110)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:109)
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:139)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:109)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.tokenRanges$lzycompute(DataSizeEstimates.scala:38)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.tokenRanges(DataSizeEstimates.scala:37)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.totalDataSizeInBytes$lzycompute(DataSizeEstimates.scala:88)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.totalDataSizeInBytes(DataSizeEstimates.scala:87)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.dataSizeInBytes$lzycompute(DataSizeEstimates.scala:81)
at com.datastax.spark.connector.rdd.partitioner.DataSizeEstimates.dataSizeInBytes(DataSizeEstimates.scala:80)
at com.datastax.spark.connector.rdd.partitioner.CassandraRDDPartitioner.<init>(CassandraRDDPartitioner.scala:41)
at com.datastax.spark.connector.rdd.partitioner.CassandraRDDPartitioner$.apply(CassandraRDDPartitioner.scala:180)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.getPartitions(CassandraTableScanRDD.scala:144)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
at $line47.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
at $line47.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
at $line47.$read$$iwC$$iwC$$iwC.<init>(<console>:61)
at $line47.$read$$iwC$$iwC.<init>(<console>:63)
at $line47.$read$$iwC.<init>(<console>:65)
at $line47.$read.<init>(<console>:67)
at $line47.$read$.<init>(<console>:71)
at $line47.$read$.<clinit>(<console>)
at $line47.$eval$.<init>(<console>:7)
at $line47.$eval$.<clinit>(<console>)
at $line47.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: unconfigured columnfamily size_estimates
at com.datastax.driver.core.Responses$Error.asException(Responses.java:136)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:179)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:184)
at com.datastax.driver.core.RequestHandler.access$2500(RequestHandler.java:43)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:798)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:617)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1005)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:928)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:831)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:346)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
res3: Long = 26849
scala>