Cassandra uuid in a Spark SQL SELECT statement

Asked: 2015-11-18 11:32:14

Tags: apache-spark cassandra

I have the following table in Cassandra (v2.2.3):
cqlsh> DESCRIBE TABLE historian.timelines;

CREATE TABLE historian.timelines (
    assetid uuid,
    tslice int,
    ...
    value map<text, text>,
    PRIMARY KEY ((assetid, tslice), ...)
) WITH CLUSTERING ORDER BY (deviceid ASC, paramid ASC, fts DESC) 
...
    ;

I want to extract the data via Apache Spark (v1.5.0) using the following Java snippet (with spark-cassandra-connector v1.5.0 and cassandra-driver-core v2.2.0-RC3):

// Initialize Spark SQL Context
CassandraSQLContext sqlContext = new CassandraSQLContext(jsc.sc());
sqlContext.setKeyspace(keyspace);
DataFrame df = sqlContext.sql("SELECT * FROM " + tableName + 
    " WHERE assetid = '085eb9c6-8a16-11e5-af63-feff819cdc9f' LIMIT 2");
df.show();

At this point I get the following error when the show method above is invoked:

cannot resolve '(assetid = cast(085eb9c6-8a16-11e5-af63-feff819cdc9f as double))' due to data type mismatch: 
differing types in '(assetid = cast(085eb9c6-8a16-11e5-af63-feff819cdc9f as double))' (uuid and double).;

So it seems that Spark SQL does not interpret the assetid input as a UUID. What can I do to handle the Cassandra UUID type in a Spark SQL query?

Thanks!

1 answer:

Answer 0 (score: 0)

Indeed, your query parameter is a string and not a UUID; just convert the query parameter like this:

import java.util.UUID;

DataFrame df = sqlContext.sql("SELECT * FROM " + tableName +
    " WHERE assetid = " + UUID.fromString("085eb9c6-8a16-11e5-af63-feff819cdc9f") + " LIMIT 2");
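To make the mechanics of this fix concrete, here is a minimal, self-contained sketch (no Spark required; the table name `timelines` is taken from the question). `UUID.fromString` validates the literal up front, throwing `IllegalArgumentException` on malformed input, and its `toString()` re-emits the canonical lowercase form, so the concatenated statement contains an unquoted uuid literal rather than a quoted string that Spark SQL would try to cast:

```java
import java.util.UUID;

public class UuidLiteralDemo {
    public static void main(String[] args) {
        // Validates the literal; a malformed string throws IllegalArgumentException here
        UUID id = UUID.fromString("085eb9c6-8a16-11e5-af63-feff819cdc9f");

        // toString() yields the canonical lowercase form, so the resulting SQL
        // contains an unquoted uuid literal rather than a quoted string
        String sql = "SELECT * FROM timelines WHERE assetid = " + id + " LIMIT 2";
        System.out.println(sql);
        // prints: SELECT * FROM timelines WHERE assetid = 085eb9c6-8a16-11e5-af63-feff819cdc9f LIMIT 2
    }
}
```

The early validation is the practical benefit over plain string concatenation: a typo in the uuid fails fast in the Java code instead of surfacing later as a Spark SQL analysis error.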