I am trying to load a file into Hive using Pentaho. This is my job. I can load it on the initial execution. Here is my log:
2018/08/20 03:53:53 - hiveloadexternal - Start of job execution
2018/08/20 03:53:53 - Carte - Installing timer to purge stale objects after 1440 minutes.
2018/08/20 03:53:53 - hiveloadexternal - exec(0, 0, START.0)
2018/08/20 03:53:53 - START - Starting job entry
2018/08/20 03:53:53 - hiveloadexternal - Starting entry [Hadoop Copy Files]
2018/08/20 03:53:53 - hiveloadexternal - exec(1, 0, Hadoop Copy Files.0)
2018/08/20 03:53:53 - Hadoop Copy Files - Starting job entry
2018/08/20 03:53:53 - Hadoop Copy Files - Starting ...
2018/08/20 03:53:56 - Hadoop Copy Files - Processing row source File/folder source : [file:///home/hauser/PENTAHO/SRC] ... destination file/folder : [hdfs://nn82.cluster.com:8020/apps/hive/warehouse/nantest]... wildcard : [.*\.txt]
2018/08/20 03:53:56 - Hadoop Copy Files -
2018/08/20 03:53:56 - Hadoop Copy Files - Fetching : [file:///home/hauser/PENTAHO/SRC]
2018/08/20 03:53:56 - Hadoop Copy Files - ------
2018/08/20 03:53:56 - Hadoop Copy Files - File [file:///home/hauser/PENTAHO/SRC/HiveTestSampleTest2.txt] was copied to [hdfs://nn82.cluster.com:8020/apps/hive/warehouse/nantest/HiveTestSampleTest2.txt]
2018/08/20 03:53:57 - hiveloadexternal - Finished job entry [Hadoop Copy Files] (result=[true])
2018/08/20 03:53:57 - hiveloadexternal - Job execution finished
2018/08/20 03:53:57 - Kitchen - Finished!
2018/08/20 03:53:57 - Kitchen - Start=2018/08/20 03:53:20.208, Stop=2018/08/20 03:53:57.059
2018/08/20 03:53:57 - Kitchen - Processing ended after 36 seconds.
But on the second execution, the same ktr gives me the following error:
2018/08/20 03:58:51 - hiveloadexternal - Start of job execution
2018/08/20 03:58:51 - hiveloadexternal - exec(0, 0, START.0)
2018/08/20 03:58:51 - START - Starting job entry
2018/08/20 03:58:51 - hiveloadexternal - Starting entry [Hadoop Copy Files]
2018/08/20 03:58:51 - hiveloadexternal - exec(1, 0, Hadoop Copy Files.0)
2018/08/20 03:58:51 - Hadoop Copy Files - Starting job entry
2018/08/20 03:58:51 - Hadoop Copy Files - Starting ...
Aug 20, 2018 3:58:51 AM org.apache.cxf.endpoint.ServerImpl initDestination
INFO: Setting the server's publish address to be /i18n
2018/08/20 03:58:51 - Hadoop Copy Files - Processing row source File/folder source : [file:///home/hauser/PENTAHO/SRC] ... destination file/folder : [file:///home/hauser/PENTAHO/data-integration/hdfs:/nn82.cluster.com:8020/apps/hive/warehouse/nantest]... wildcard : [.*\.txt]
2018/08/20 03:58:51 - Hadoop Copy Files - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : Folder file:///home/hauser/PENTAHO/data-integration/hdfs:/nn82.cluster.com:8020/apps/hive/warehouse/nantest does not exist !
2018/08/20 03:58:51 - Hadoop Copy Files - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : Destination folder does not exist!
2018/08/20 03:58:51 - hiveloadexternal - Finished job entry [Hadoop Copy Files] (result=[false])
2018/08/20 03:58:51 - hiveloadexternal - Job execution finished
2018/08/20 03:58:51 - Kitchen - Finished!
2018/08/20 03:58:51 - Kitchen - ERROR (version 8.1.0.0-365, build 8.1.0.0-365 from 2018-04-30 09.42.24 by buildguy) : Finished with errors
2018/08/20 03:58:51 - Kitchen - Start=2018/08/20 03:58:42.099, Stop=2018/08/20 03:58:51.586
2018/08/20 03:58:51 - Kitchen - Processing ended after 9 seconds.
i.e., on the first run, the destination directory is resolved correctly as an HDFS location. But on the second run, the same hdfs:// destination is treated as a local path relative to the data-integration directory (file:///home/hauser/PENTAHO/data-integration/hdfs:/...). Why does this happen?
Note: I am running my job from the command line.
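For reference, this is roughly how I invoke it (the exact .kjb file name and path here are illustrative, based on the job name hiveloadexternal in the log; only -file and -level are the actual Kitchen options I use):

# run from the data-integration directory
cd /home/hauser/PENTAHO/data-integration
./kitchen.sh -file=/home/hauser/PENTAHO/hiveloadexternal.kjb -level=Basic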