I installed DSE 4.6 on my server. I created a keyspace and a table, and inserted some data into the table. When I try to read the data with Spark SQL, I get this exception:
Exception in thread "main" java.lang.RuntimeException: Table Not Found: test.table3
Here is my Java code:
import com.datastax.bdp.spark.DseSparkConfHelper;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.api.java.JavaSQLContext;
import org.apache.spark.sql.api.java.JavaSchemaRDD;

public class Main {
    public static void main(String[] args) {
        SparkConf conf = DseSparkConfHelper.enrichSparkConf(new SparkConf())
                .setAppName("Simple App")
                .setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaSQLContext sqlContext = new JavaSQLContext(sc);
        JavaSchemaRDD data = sqlContext.sql("SELECT * FROM test.table3");
        System.out.println(data.count());
    }
}
I am using Maven to manage dependencies. Here is my pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>cassandra-data-retriever</artifactId>
    <version>1.0</version>
    <repositories>
        <repository>
            <id>project.local</id>
            <name>project</name>
            <url>file:${project.basedir}/projectrepo</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector-java_2.10</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>com.datastax</groupId>
            <artifactId>bdp</artifactId>
            <version>4.6.0</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
EDIT: The table does exist. Here is the output from cqlsh:
cqlsh> DESCRIBE TABLE test.table3

CREATE TABLE table3 (
  id int,
  birth_date timestamp,
  name text,
  PRIMARY KEY ((id))
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};
EDIT2: I can also read the data using cqlsh:
cqlsh> SELECT * FROM test.table3;

 id | birth_date               | name
----+--------------------------+-------
  1 | 1970-01-15 07:48:54+0100 | Peter

(1 rows)