Below is the code I am running and the error I receive.
Creating the temporary views:
sqlcontext.sql("""CREATE TEMPORARY VIEW temp_pay_txn_stage
USING org.apache.spark.sql.cassandra
OPTIONS (
table "t_pay_txn_stage",
keyspace "ks_pay",
cluster "Test Cluster",
pushdown "true"
)""".stripMargin)
sqlcontext.sql("""CREATE TEMPORARY VIEW temp_pay_txn_source
USING org.apache.spark.sql.cassandra
OPTIONS (
table "t_pay_txn_source",
keyspace "ks_pay",
cluster "Test Cluster",
pushdown "true"
)""".stripMargin)
I then query these views to get the new records in stage that do not exist in source.
val df_newrecords = sqlcontext.sql("""Select UUID(),
stage.order_id,
stage.order_description,
stage.transaction_id,
stage.pre_transaction_freeze_balance,
stage.post_transaction_freeze_balance,
toTimestamp(now()),
NULL,
1 from temp_pay_txn_stage stage left join temp_pay_txn_source source on stage.order_id=source.order_id and stage.transaction_id=source.transaction_id where
source.order_id is null and source.transaction_id is null""")
org.apache.spark.sql.AnalysisException: Undefined function: 'uuid()'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 7
I am trying to generate a UUID, but I get this error.
Answer 0 (score: 0)
Here is a simple example of how to generate a timeuuid:
import org.apache.spark.sql.SQLContext
// udf() wraps a Scala function so it can be applied to DataFrame columns
import org.apache.spark.sql.functions.udf
val sqlcontext = new SQLContext(sc)
import sqlcontext.implicits._
// Import UUIDs, which contains the method timeBased()
import com.datastax.driver.core.utils.UUIDs
// User-defined function timeUUID, which returns a time-based UUID as a string
val timeUUID = udf(() => UUIDs.timeBased().toString)
// Sample query to test with; you can change it to yours
val df_newrecords = sqlcontext.sql("SELECT 1 as data UNION SELECT 2 as data").withColumn("time_uuid", timeUUID())
// Print all the rows
df_newrecords.collect().foreach(println)
Output:
[1,9a81b3c0-170b-11e7-98bf-9bb55f3128dd]
[2,9a831350-170b-11e7-98bf-9bb55f3128dd]
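The query in the question calls the function from inside the SQL text rather than through withColumn. A minimal sketch of that variant follows, assuming a spark-shell session where `sc` is already defined (as in the example above) and the DataStax Java driver is on the classpath. The SQL name `timeUUID` and the value `df_registered` are names chosen here for illustration only.

```scala
import org.apache.spark.sql.SQLContext
import com.datastax.driver.core.utils.UUIDs

// Assumption: running in spark-shell, so `sc` (the SparkContext) already exists.
val sqlcontext = new SQLContext(sc)

// Register the generator under a SQL-visible name.
// "timeUUID" is a hypothetical name picked for this sketch.
sqlcontext.udf.register("timeUUID", () => UUIDs.timeBased().toString)

// The registered function can now be called inside the SQL string itself.
val df_registered = sqlcontext.sql(
  "SELECT timeUUID() AS time_uuid, data FROM (SELECT 1 AS data UNION SELECT 2 AS data) t")

// Print all the rows; the UUID values differ on every run.
df_registered.collect().foreach(println)
```

With a function registered this way, `UUID()` in the original query could be replaced by `timeUUID()`. Note that `toTimestamp` in that query is a CQL function, so it will probably fail with the same kind of "Undefined function" error; Spark SQL's built-in `current_timestamp()` is likely the closest replacement.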
Sources: https://stackoverflow.com/a/37232099/2320144 https://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/utils/UUIDs.html#timeBased--