How to run a PDI transformation involving databases from Java?

Asked: 2015-10-06 05:21:55

Tags: java database pentaho kettle pdi

I am trying to run a PDI transformation involving a database (any database, but preferably a NoSQL one) from Java.

I have tried MongoDB and Cassandra, and I already asked about that here: Running PDI Kettle on Java - Mongodb Step Missing Plugins, but nobody has replied yet.

I also tried switching to a SQL database, using PostgreSQL, but it still doesn't work. From my research, I believe this is because I am not connecting to the database correctly from Java, but I haven't found any tutorial or guidance that works for me. I tried to follow the instructions in this blog post: http://ameethpaatil.blogspot.co.id/2010/11/pentaho-data-integration-java-maven.html but still ran into problems with the repository (I don't have one, and it seems to be required).

The transformation runs fine from Spoon. It only fails when I run it from Java.

Can anyone help me run a PDI transformation that involves a database? Where did I go wrong?

Has anyone successfully run a PDI transformation involving NoSQL or SQL databases from Java? Which database did you use?

I apologize if I am asking too many questions; I am quite desperate. Any kind of information would be greatly appreciated. Thank you.

4 Answers:

Answer 0 (score: 3)

Executing a PDI job from Java is very simple. You just need to import all the required jar files (including those for the database) and then call the Kettle classes. The best approach is obviously to use Maven to manage the dependencies. In your Maven pom.xml file, simply declare the database driver.

A sample Maven file looks like this, assuming you are using Pentaho v5.0.0-GA with PostgreSQL as the database:

<dependencies>
    <!-- Pentaho Kettle Core dependencies development -->
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-core</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-dbdialog</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-engine</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle-ui-swt</artifactId>
        <version>5.0.0.1</version>
    </dependency>
    <dependency>
        <groupId>pentaho-kettle</groupId>
        <artifactId>kettle5-log4j-plugin</artifactId>
        <version>5.0.0.1</version>
    </dependency>

    <!-- The database dependency files. Use it if your kettle file involves database connectivity. -->
    <dependency>
        <groupId>postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>9.1-902.jdbc4</version>
    </dependency>
</dependencies>

You can check my blog for more. It works for database connections.

Hope this helps :)
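The answer above says to "call the Kettle classes" but does not show them. A minimal sketch of that call sequence, assuming the pom.xml above and a hypothetical `my_transformation.ktr` file path, might look like this:

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunTransformation {
    public static void main(String[] args) throws Exception {
        // Initialize the Kettle engine (registers steps and database plugins)
        KettleEnvironment.init();

        // Load the transformation metadata ("my_transformation.ktr" is a placeholder path)
        TransMeta meta = new TransMeta("my_transformation.ktr");

        // Execute the transformation and block until all steps have finished
        Trans trans = new Trans(meta);
        trans.execute(null);
        trans.waitUntilFinished();

        if (trans.getErrors() > 0) {
            throw new RuntimeException("Transformation finished with errors");
        }
    }
}
```

This requires the Pentaho Kettle jars on the classpath; without them the class will not compile.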

Answer 1 (score: 1)

  • I tried using a "transformation without jndi" and it works!

But I needed to add this repository to my pom.xml:

<repositories>
    <repository>
        <id>pentaho-releases</id>
        <url>http://repository.pentaho.org/artifactory/repo/</url>
    </repository>
</repositories>
  • When I try to use a datasource, I get this error: Cannot instantiate class: org.osjava.sj.SimpleContextFactory [Root exception is java.lang.ClassNotFoundException: org.osjava.sj.SimpleContextFactory]

Full log here: https://gist.github.com/eb15f8545e3382351e20.git

[FIX]: Add this dependency:

<dependency>
    <groupId>pentaho</groupId>
    <artifactId>simple-jndi</artifactId>
    <version>1.0.1</version>
</dependency>
  • After that, a new error appears:

    transformation_with_jndi - Dispatching started for transformation [transformation_with_jndi]
    Table input.0 - ERROR (version 5.0.0.1.19046, build 1 from 2013-09-11_13-51-13 by buildguy): An error occurred, processing will be stopped:
    Table input.0 - An error occurred while trying to connect to the database
    Table input.0 - java.io.File parameter must be a directory. [D:\opt\workspace-eclipse\invoke-ktr-jndi\simple-jndi]

Full log: https://gist.github.com/jrichardsz/9d74c7263f3567ac4b45

[EXPLANATION] This is caused by

KettleEnvironment.init(); 

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/KettleEnvironment.java

which contains this check:

if (simpleJndi) {
    JndiUtil.initJNDI();
}

In JndiUtil:

String path = Const.JNDI_DIRECTORY;
if ((path == null) || (path.equals("")))

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/JndiUtil.java

In the Const class:

public static String JNDI_DIRECTORY = NVL(System.getProperty("KETTLE_JNDI_ROOT"), System.getProperty("org.osjava.sj.root"));

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/blob/master/running-etl-transformation-using-java/researching-pentaho-classes/Const.java

So we need to set the KETTLE_JNDI_ROOT variable.

[FIX] A small change to your example: just add this line
System.setProperty("KETTLE_JNDI_ROOT", jdbcPropertiesPath);

before

KettleEnvironment.init();

A complete example based on your code:

import java.io.File;
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class ExecuteSimpleTransformationWithJndiDatasource {    

    public static void main(String[] args) {

        String resourcesPath = (new File(".").getAbsolutePath())+"\\src\\main\\resources";
        String ktr_path = resourcesPath+"\\transformation_with_jndi.ktr";

        //KETTLE_JNDI_ROOT could be the simple-jndi folder in your pdi or spoon home.
        //In this example, it is the resources folder.
        String jdbcPropertiesPath = resourcesPath;

        try {
            /**
             * Initialize the Kettle Environment
             */
            System.setProperty("KETTLE_JNDI_ROOT", jdbcPropertiesPath);
            KettleEnvironment.init();

            /**
             * Create a trans object to properly assign the ktr metadata.
             * 
             * @filedb: The ktr file path to be executed.
             * 
             */
            TransMeta metadata = new TransMeta(ktr_path);
            Trans trans = new Trans(metadata);

            // Execute the transformation
            trans.execute(null);
            trans.waitUntilFinished();

            // checking for errors
            if (trans.getErrors() > 0) {
                System.out.println("Error executing transformation");
            }

        } catch (KettleException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

    }

}

For the complete example, check my GitHub repository:

https://github.com/jrichardsz/pentaho-pdi-spoon-usefull-templates/tree/master/running-etl-transformation-using-java/invoke-transformation-from-java-jndi/src/main/resources
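KETTLE_JNDI_ROOT must point to a folder containing a jdbc.properties file in simple-jndi format. A hypothetical entry for a PostgreSQL datasource might look like the following (the connection name, host, database, and credentials are all placeholders; the name before the `/` must match the JNDI connection name used in the transformation's Table input step):

```properties
# src/main/resources/jdbc.properties (simple-jndi format)
mydatasource/type=javax.sql.DataSource
mydatasource/driver=org.postgresql.Driver
mydatasource/url=jdbc:postgresql://localhost:5432/mydb
mydatasource/user=postgres
mydatasource/password=secret
```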

Answer 2 (score: 1)

I ran into the same problem in an application that uses the Pentaho libraries. I solved it with this code:

A singleton to initialize Kettle:

import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.exception.KettleException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Initializes the Kettle environment variable configuration
 * 
 * @author Marcos Souza
 * @version 1.0
 *
 */
public class AtomInitKettle {

    private static final Logger LOGGER = LoggerFactory.getLogger(AtomInitKettle.class);

    private static AtomInitKettle instance;

    private AtomInitKettle() throws KettleException {
        try {
            LOGGER.info("Starting Kettle");
            KettleJNDI.protectSystemProperty();
            KettleEnvironment.init();
            LOGGER.info("Kettle started successfully");
        } catch (Exception e) {
            LOGGER.error("Message: {} Cause {} ", e.getMessage(), e.getCause());
        }
    }

    // Accessor so the private constructor can actually be invoked once
    public static synchronized void init() throws KettleException {
        if (instance == null) {
            instance = new AtomInitKettle();
        }
    }
}

The code that saved me:

import java.io.File;
import java.util.Properties;

import org.pentaho.di.core.Const;
import org.pentaho.di.core.exception.KettleException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class KettleJNDI {

    private static final Logger LOGGER = LoggerFactory.getLogger(KettleJNDI.class);

    public static final String SYS_PROP_IC = "java.naming.factory.initial";

    private static boolean init = false;

    private KettleJNDI() {

    }

    public static void initJNDI() throws KettleException {
        String path = Const.JNDI_DIRECTORY;
        LOGGER.info("Kettle Const.JNDI_DIRECTORY= {}", path);

        if (path == null || path.equals("")) {
            try {
                File file = new File("simple-jndi");
                path = file.getCanonicalPath();
            } catch (Exception e) {
                throw new KettleException("Error initializing JNDI", e);
            }
            Const.JNDI_DIRECTORY = path;
            LOGGER.info("Kettle null > Const.JNDI_DIRECTORY= {}", path);
        }

        System.setProperty("java.naming.factory.initial", "org.osjava.sj.SimpleContextFactory");
        System.setProperty("org.osjava.sj.root", path);
        System.setProperty("org.osjava.sj.delimiter", "/");
    }

    public static void protectSystemProperty() {
        if (init) {
            return;
        }

        System.setProperties(new ProtectionProperties(SYS_PROP_IC, System.getProperties()));

        if (LOGGER.isInfoEnabled()) {
            LOGGER.info("Kettle System Property Protector: System.properties replaced by custom properties handler");
        }

        init = true;
    }

    public static class ProtectionProperties extends Properties {

        private static final long serialVersionUID = 1L;
        private final String protectedKey;

        public ProtectionProperties(String protectedKey, Properties prprts) {
            super(prprts);
            if (protectedKey == null) {
                throw new IllegalArgumentException("Properties protection was provided a null key");
            }
            this.protectedKey = protectedKey;
        }

        @Override
        public synchronized Object setProperty(String key, String value) {
            // We forbid changes in general, but do it silent ...
            if (protectedKey.equals(key)) {
                if (LOGGER.isDebugEnabled()) {
                    LOGGER.debug("Kettle System Property Protector: Protected change to '" + key + "' with value '" + value + "'");
                }

                return super.getProperty(protectedKey);
            }

            return super.setProperty(key, value);
        }
    }
}

Answer 3 (score: 1)

I think your problem lies in the database connection. You can configure it in the transformation itself; you don't need to use JNDI.

import org.pentaho.di.core.database.DatabaseMeta;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DatabaseMetaStep {

    private static final Logger LOGGER = LoggerFactory.getLogger(DatabaseMetaStep.class);

    /**
     * Sets up the database access configuration
     * 
     * @return a configured DatabaseMeta
     */
    public static DatabaseMeta createDatabaseMeta() {
        DatabaseMeta databaseMeta = new DatabaseMeta();

        LOGGER.info("Loading database access information");
        databaseMeta.setHostname("localhost");
        databaseMeta.setName("stepName");
        databaseMeta.setUsername("user");
        databaseMeta.setPassword("password");
        databaseMeta.setDBPort("port");
        databaseMeta.setDBName("database");     
        databaseMeta.setDatabaseType("MonetDB"); // sql, MySql ...
        databaseMeta.setAccessType(DatabaseMeta.TYPE_ACCESS_NATIVE);

        return databaseMeta;
    }
}

Then you need to set the databaseMeta on the TransMeta:

DatabaseMeta databaseMeta = DatabaseMetaStep.createDatabaseMeta();

TransMeta transMeta = new TransMeta();
transMeta.setUsingUniqueConnections(true);
transMeta.setName("transMetaName");

List<DatabaseMeta> databases = new ArrayList<>();
databases.add(databaseMeta);
transMeta.setDatabases(databases);
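The snippet above builds an empty TransMeta in code, but the question is about running an existing .ktr file. A hedged sketch of combining the two approaches, assuming a hypothetical `my_transformation.ktr` whose connection name matches the one set in `createDatabaseMeta()`, would load the file and then override its connection list so no JNDI configuration is needed:

```java
import java.util.Collections;

import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.database.DatabaseMeta;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunWithDatabaseMeta {
    public static void main(String[] args) throws Exception {
        KettleEnvironment.init();

        // Load an existing transformation ("my_transformation.ktr" is a placeholder)
        TransMeta transMeta = new TransMeta("my_transformation.ktr");

        // Replace its connection definitions with one built purely in code
        DatabaseMeta databaseMeta = DatabaseMetaStep.createDatabaseMeta();
        transMeta.setDatabases(Collections.singletonList(databaseMeta));

        Trans trans = new Trans(transMeta);
        trans.execute(null);
        trans.waitUntilFinished();

        if (trans.getErrors() > 0) {
            System.out.println("Transformation finished with errors");
        }
    }
}
```

As with the other examples, this compiles only with the Pentaho Kettle jars (and the `DatabaseMetaStep` class above) on the classpath.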