My question is about a Hive UDF. I created a UDF that converts a String date to a Julian date. It works fine when I run a select query, but it throws an error when I use it with Create table temp as.
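For context, the kind of UDF described above might look roughly like the sketch below. This is not the actual com.convertToJulian class, which is not shown in the question. The class name ConvertToJulianSketch is hypothetical, "Julian date" is assumed to mean the Julian Day Number, and the input is assumed to be in yyyy-MM-dd form (the form to_date() returns). So that it compiles without Hive on the classpath, the sketch does not extend Hive's UDF base class.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.time.temporal.JulianFields;

// Hypothetical stand-in for com.convertToJulian. A real Hive UDF would be a
// public class extending org.apache.hadoop.hive.ql.exec.UDF, packaged in udf.jar.
class ConvertToJulianSketch {

    // Hive resolves a simple UDF call to a method named evaluate.
    // Assumption: the date string is in ISO yyyy-MM-dd form.
    public Long evaluate(String date) {
        if (date == null) {
            return null; // pass SQL NULL through unchanged
        }
        try {
            // Assumption: "Julian date" means the Julian Day Number.
            return LocalDate.parse(date.trim()).getLong(JulianFields.JULIAN_DAY);
        } catch (DateTimeParseException e) {
            return null; // unparseable input becomes NULL in this sketch
        }
    }
}
```

Under these assumptions, evaluate("1970-01-01") yields 2440588, and comparing two results with < orders the underlying dates chronologically, which is how the WHERE clause in the queries uses the function.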
Create function convertToJulian as 'com.convertToJulian'
Using jar 'hdfs:/user/hive/'.

Executing the select query:

SELECT name, date FROM custTable
WHERE name is not null and convertToJulian(date) < convertToJulian
(to_date(from_unixtime(unix_timestamp())));

Output:

converting to local hdfs:/user/hive/udf.jar
Added [/usr/local/hivetmp/amit.pathak/9381feb3-6c5f-469b-b6b1-9af55abbdabd/udf.jar] to class path
Added resources: [hdfs:/user/hive/udf.jar]

It works fine and gives me exactly the data I need. Now, as a second step, I want to put this data into a new table, so I ran:

CREATE Table trop
As
SELECT name, date FROM custTable
WHERE name is not null and convertToJulian(date) <
convertToJulian (to_date(from_unixtime(unix_timestamp())));

This is the statement that throws the error. I cannot figure out why it is getting data from the HDFS location hdfs://localhost:54310/usr/local/hivetmp/amit.pathak/9381feb3-6c5f-469b-b6b1-9af55abbdabd/udf.jar

I also tried a few things, such as manually adding the data to HDFS. But Hive generates a random session ID and creates a folder named after that session ID.