Pentaho Data Integration dynamic connections (reading connection details from a database)

Date: 2016-10-06 14:08:31

Tags: pentaho pdi pentaho-data-integration

Pentaho Data Integration: CE 6.1.0.1-196

I am new to Pentaho Data Integration. I need to run the same query against several databases. I created a table in the master database to store the connection information for the other databases that have to be queried. The table structure is below.

SQL> desc database_connection;
Name          Type          Nullable Default Comments 
------------- ------------- -------- ------- -------- 
DATABASE_NAME VARCHAR2(32)  Y                         
JDBC_URL      VARCHAR2(512) Y                         
USERNAME      VARCHAR2(32)  Y                         
PASSWORD      VARCHAR2(32)  Y
ENABLED       VARCHAR2(1)   Y   

Sample data:

DATABASE_NAME: XPTO
JDBC_URL: (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xptosrv.xyz.com)(PORT = 1521))(LOAD_BALANCE = ON)(FAILOVER = ON)(CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = XPTO.XYZ.COM)(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))))
USERNAME: SYSTEM
PASSWORD: blablabla
ENABLED: Y

My .ktr files:

(set_variables.ktr)

Table input ---> Copy rows to result

The query associated with the Table input step runs against the master database:

select database_name, jdbc_url, username, password from database_connection where enabled = 'Y'

(db_query.ktr)

Table input ---> Table output

The query associated with the Table input step runs against the remote databases (one per connection row) and the Table output step stores the results in the master database.

My .kjb files:

(run_for_each_row.kjb)

Start ---> Transformation ---> Success

Transformation filename: ${Internal.Job.Filename.Directory}/db_query.ktr

Job properties, Parameters:

DATABASE_NAME, JDBC_URL, PASSWORD, USERNAME

(master_job.kjb)

Start ---> Transformation ---> Job for each row ---> Success

Transformation filename: ${Internal.Job.Filename.Directory}/set_variables.ktr

Job for each row, job filename: ${Internal.Job.Filename.Directory}/run_for_each_row.kjb

Job for each row... Advanced tab: "Copy previous results to parameters" -> checked, "Execute for every input row" -> checked

Job for each row... Parameters: DATABASE_NAME, JDBC_URL, PASSWORD, USERNAME
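For reference, the job layout above amounts to the following loop. This is a minimal sketch in Python against an in-memory SQLite database (table and column names taken from the question; the per-database step is simulated, since the real job would open a JDBC connection built from JDBC_URL/USERNAME/PASSWORD):

```python
import sqlite3

# Master database holding one row per target connection,
# mirroring the database_connection table from the question.
master = sqlite3.connect(":memory:")
master.execute("""CREATE TABLE database_connection (
    database_name TEXT, jdbc_url TEXT,
    username TEXT, password TEXT, enabled TEXT)""")
master.executemany(
    "INSERT INTO database_connection VALUES (?, ?, ?, ?, ?)",
    [("XPTO", "(DESCRIPTION=...)", "SYSTEM", "blablabla", "Y"),
     ("LEGACY", "(DESCRIPTION=...)", "SYSTEM", "blablabla", "N")])

# set_variables.ktr: Table input + Copy rows to result.
rows = master.execute(
    "SELECT database_name, jdbc_url, username, password "
    "FROM database_connection WHERE enabled = 'Y'").fetchall()

# master_job.kjb + run_for_each_row.kjb: run db_query.ktr once per
# result row, with the row's fields passed in as named parameters.
executed = []
for database_name, jdbc_url, username, password in rows:
    # db_query.ktr would connect here using the row's credentials;
    # we only record which connection the iteration would use.
    executed.append(database_name)

print(executed)
```

Only the enabled (`ENABLED = 'Y'`) row reaches the loop, so the slave transformation runs once per active connection.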

Execution log:

2016/10/06 10:36:15 - Spoon - Iniciando o job...
2016/10/06 10:36:15 - master_job - Início da execução do job
2016/10/06 10:36:15 - master_job - Starting entry [Transformation]
2016/10/06 10:36:15 - Transformation - Loading transformation from XML file [file:///D:/pdi/set_variables.ktr]
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - set_variables - Expedindo início para transformação [set_variables]
2016/10/06 10:36:15 - Table input.0 - Finished reading query, closing connection.
2016/10/06 10:36:15 - Copy rows to result.0 - Finished processing (I=0, O=0, R=6, W=6, U=0, E=0)
2016/10/06 10:36:15 - Table input.0 - Finished processing (I=6, O=0, R=0, W=6, U=0, E=0)
2016/10/06 10:36:15 - master_job - Starting entry [Job for each row]
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - slave_job - Starting entry [Transformation]
2016/10/06 10:36:15 - Transformation - Loading transformation from XML file [file:///D:/pdi/db_query.ktr]
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - cfgbuilder - Warning: The configuration parameter [org] is not supported by the default configuration builder for scheme: sftp
2016/10/06 10:36:15 - db_query - Expedindo início para transformação [db_query]
2016/10/06 10:36:15 - Table input.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : An error occurred, processing will be stopped: 
2016/10/06 10:36:15 - Table input.0 - Error occurred while trying to connect to the database
2016/10/06 10:36:15 - Table input.0 - 
2016/10/06 10:36:15 - Table input.0 - Error connecting to database: (using class oracle.jdbc.driver.OracleDriver)
2016/10/06 10:36:15 - Table input.0 - Erro de ES: Connect identifier was empty.
2016/10/06 10:36:15 - Table input.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Erro inicializando step [Table input]
2016/10/06 10:36:15 - Table output.0 - Connected to database [REPORT] (commit=1000)
2016/10/06 10:36:15 - db_query - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Step [Table input.0] falhou durante inicialização!
2016/10/06 10:36:15 - Table input.0 - Finished reading query, closing connection.
2016/10/06 10:36:15 - Transformation - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Unable to prepare for execution of the transformation
2016/10/06 10:36:15 - Transformation - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : org.pentaho.di.core.exception.KettleException: 
2016/10/06 10:36:15 - Transformation - Falhou a inicialização de pelo menos um step. A Execução não pode sere iniciada!
2016/10/06 10:36:15 - Transformation - 
2016/10/06 10:36:15 - Transformation - 
2016/10/06 10:36:15 - Transformation -  at org.pentaho.di.trans.Trans.prepareExecution(Trans.java:1142)
2016/10/06 10:36:15 - Transformation -  at org.pentaho.di.trans.Trans.execute(Trans.java:612)
2016/10/06 10:36:15 - Transformation -  at org.pentaho.di.job.entries.trans.JobEntryTrans.execute(JobEntryTrans.java:1097)
2016/10/06 10:36:15 - Transformation -  at org.pentaho.di.job.Job.execute(Job.java:723)
2016/10/06 10:36:15 - Transformation -  at org.pentaho.di.job.Job.execute(Job.java:864)
2016/10/06 10:36:15 - Transformation -  at org.pentaho.di.job.Job.execute(Job.java:608)
2016/10/06 10:36:15 - Transformation -  at org.pentaho.di.job.entries.job.JobEntryJobRunner.run(JobEntryJobRunner.java:69)
2016/10/06 10:36:15 - Transformation -  at java.lang.Thread.run(Thread.java:745)
2016/10/06 10:36:15 - slave_job - Finished job entry [Transformation] (result=[false])
2016/10/06 10:36:15 - master_job - Finished job entry [Job for each row] (result=[false])
2016/10/06 10:36:15 - master_job - Finished job entry [Transformation] (result=[false])
2016/10/06 10:36:15 - master_job - Job execution finished
2016/10/06 10:36:15 - Spoon - O Job finalizou.

The data from the database_connection table is being read correctly:

2016/10/06 10:36:15 - set_variables - Expedindo início para transformação [set_variables]
2016/10/06 10:36:15 - Table input.0 - Finished reading query, closing connection.
2016/10/06 10:36:15 - Copy rows to result.0 - Finished processing (I=0, O=0, R=6, W=6, U=0, E=0)
2016/10/06 10:36:15 - Table input.0 - Finished processing (I=6, O=0, R=0, W=6, U=0, E=0)

But I do not know what I am doing wrong: the data is not being passed on as parameters.

I would appreciate any help, as I have been stuck on this problem for several days now.

The examples I found on Stack Overflow and the Pentaho forums did not help much.

Project files: https://github.com/scarlosantos/pdi

Thanks.

2 answers:

Answer 0 (score: 0)

Use a Set Variables step instead of Copy rows to result in set_variables.ktr, and use the variables in the connection properties. PDI will substitute them at run time, and you will have a dynamic database connection.
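As a sketch of what "variables in the connection properties" can look like (assuming the Oracle thin driver and a Generic database connection; the exact URL syntax depends on your driver), the connection URL field could read:

```
jdbc:oracle:thin:@${JDBC_URL}
```

with ${USERNAME} and ${PASSWORD} in the user name and password fields; PDI replaces each ${...} variable when the transformation starts.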

Answer 1 (score: 0)

This exact use case is explained well in the FAQ Beginner Section.

In short:

0) Check all the drivers.

1) Do not forget to declare the names of these variables on the transformations and the jobs (right-click anywhere, Properties, Parameters). They must be defined at the job level as well.

2) Important: go to the View pane (in the left panel; you are most likely on Design) and share the connection, so that PDI knows about it in every transformation and job.

3) Edit the connection and, in the Host Name, Database Name, ... boxes, write ${HOST}, ${DATABASE_NAME}, ... or whatever names you gave your variables. If you completed step (1), just press Ctrl-Space and pick them from the drop-down list.

4) Edit the file named C:\Users\yourname\.kettle\shared.xml with Notepad. It is even worthwhile to keep a copy of the last working version. And, if you are brave enough, you can even generate this file with PDI.
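As a rough sketch of the shape of shared.xml (the connection name, variable names, and the exact element set here are illustrative and vary by PDI version; copy from a connection PDI itself has saved rather than typing one from scratch):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sharedobjects>
  <connection>
    <name>dynamic_db</name>
    <server>${HOST}</server>
    <type>ORACLE</type>
    <access>Native</access>
    <database>${DATABASE_NAME}</database>
    <port>${PORT}</port>
    <username>${USERNAME}</username>
    <password>${PASSWORD}</password>
  </connection>
</sharedobjects>
```

Every transformation and job then sees one shared connection whose fields resolve from the current variable values.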

Now, you raise an interesting point: you seem to store the complete JDBC connect descriptor, which you can use in PDI with a Generic database connection. With that approach, however, PDI does not know which SQL dialect you are using. So if you hit strange errors along the way, make sure you do not SELECT *, do not use lazy conversion, and check the types under Right-click / Output Fields.