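I can't figure out why the unbase64 function isn't working in my Spark SQL query.
Here is an example. I'm trying to decode "VGhpcyBpcyBhIHRlc3Qh" by calling the unbase64 function in Spark SQL. Any ideas why the output isn't being decoded? Thanks.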
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.functions import unbase64
sc = SparkContext("local", "Simple App")
sqlContext = SQLContext(sc)
log = [{"eventTime":"2015-12-14 15:27:00","id":"9ab0135f-b8a3-4312-9065-9f8874fd790c","fullLog":"VGhpcyBpcyBhIHRlc3Qh"}]
df = sqlContext.createDataFrame(log)
df.registerTempTable('data')
query = sqlContext.sql('SELECT unbase64(fullLog) as test FROM data')
query.write.save("output", format="json")
The output I get is: {"test":"VGhpcyBpcyBhIHRlc3Qh"}
The output I want is: {"test":"This is a test!"}
Answer 0 (score: 0)
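This seems to work for me. unbase64 returns a binary column, and as far as I can tell the JSON writer encodes binary values as base64 again, so the result needs a cast to string: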
from pyspark.sql import SQLContext
# Reuses sc and sqlContext as created in the question.
log = [("2015-12-14 15:27:00","9ab0135f-b8a3-4312-9065-9f8874fd790c","VGhpcyBpcyBhIHRlc3Qh")]
rdd_log = sc.parallelize(log)
df = sqlContext.createDataFrame(rdd_log, ["eventTime", "id", "fullLog"])
df.registerTempTable("data")
query = sqlContext.sql('SELECT unbase64(fullLog) as test FROM data')
# unbase64 yields binary; cast it to string to get readable text.
query = query.select(query.test.cast("string").alias('test'))
print(query.collect())
>> [Row(test=u'This is a test!')]
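To see why the cast matters, the same round trip can be reproduced without Spark using Python's standard base64 module. This is only an analogy, assuming Spark's JSON writer base64-encodes binary columns on output. The helper names (decode_to_bytes, write_binary_as_json, write_string_as_json) are made up for illustration and are not Spark APIs.

```python
import base64
import json

ENCODED = "VGhpcyBpcyBhIHRlc3Qh"


def decode_to_bytes(value):
    """Rough analogue of unbase64(col): the result is raw bytes, not text."""
    return base64.b64decode(value)


def write_binary_as_json(raw):
    """Rough analogue of writing a binary column to JSON.

    JSON has no bytes type, so the bytes are base64-encoded again
    (assumed to mirror what Spark's JSON writer does).
    """
    return json.dumps({"test": base64.b64encode(raw).decode("ascii")})


def write_string_as_json(raw):
    """Rough analogue of cast("string") before writing: bytes read as UTF-8 text."""
    return json.dumps({"test": raw.decode("utf-8")})


raw = decode_to_bytes(ENCODED)
without_cast = write_binary_as_json(raw)
with_cast = write_string_as_json(raw)

print(without_cast)
print(with_cast)
```

The first printed line carries the original base64 text, which matches what the question observed. The second carries the decoded sentence, which is what the cast to string achieves in the Spark query.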