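I want to write a generic Flink job in Java that accepts any SQL SELECT query, runs it against a SQL database, and writes the result to an Elasticsearch index.
One problem I have to solve is creating a DataSource for the JDBC connection. I want to use JDBCInputFormat, and I followed the example in the data source documentation.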
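The problem is that the generic type of the DataSource has to be specified. I can only use a Tuple type, because the generic type OUT of JDBCInputFormat extends Tuple. But I do not know at compile time which Tuple I will be using.
Is there a generic InputFormat I could use instead? Can Tuple itself be specified as the generic type? I am using Java 7 and Apache Flink 0.10.2.
I tried using a Tuple25 containing only Strings, but I got an exception.
Here is the code, followed by the exception.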
DataSource<StringsTuple25> database = flink.createInput(
        JDBCInputFormat.buildJDBCInputFormat()//
                .setDrivername(getDatabaseDriverName())//
                .setDBUrl(getDatabaseUrl())//
                .setUsername(getDatabaseUsername())//
                .setPassword(getDatabasePassword())//
                .setQuery(getQuery())//
                .finish(),
        StringsTuple25.typeInformation()
);
My StringsTuple25 class:
public class StringsTuple25 extends
        Tuple25<String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String, String> {
    private static final long serialVersionUID = 1L;

    public static TypeInformation<?> typeInformation() {
        TypeInformation<String>[] types = new TypeInformation[25];
        Arrays.fill(types, STRING_TYPE_INFO);
        return new TupleTypeInfo<>(Tuple25.class, types);
    }
}
I get this exception:
Caused by: java.io.IOException: Tuple size does not match columncount
    at org.apache.flink.api.java.io.jdbc.JDBCInputFormat.extractTypes(JDBCInputFormat.java:180)
    at org.apache.flink.api.java.io.jdbc.JDBCInputFormat.nextRecord(JDBCInputFormat.java:162)
    at org.apache.flink.api.java.io.jdbc.JDBCInputFormat.nextRecord(JDBCInputFormat.java:51)
    at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:169)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
    at java.lang.Thread.run(Thread.java:745)
Answer 0 (score: 2)
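As the error indicates, the number of attributes in the Tuple type you use must match the number of columns selected in your SQL query. In addition, the data type of each attribute must match.
For example, if you run SELECT id, name FROM ... where id is an INTEGER and name is a VARCHAR, you would use DataStream<Tuple2<Integer,String>> (or specialize your own class, class MyResultType extends Tuple2<Integer,String>, and use DataStream<MyResultType>) and provide the corresponding TypeInformation.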
You can also use the generic Tuple type. Your stream would then be DataStream<Tuple> (without specifying the number or types of the attributes). However, for the TypeInformation you need to know the number of attributes:
Tuple t = Tuple.getTupleClass(numberOfAttributes).newInstance();
for (int i = 0; i < numberOfAttributes; i++) {
    t.setField("", i);
}
TypeInformation<Tuple> typeInfo = TypeExtractor.getForObject(t);
Thus, you need to infer the number of selected attributes from the given arguments that define the SQL query.
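One possible way to infer that number is to count the top-level entries in the SELECT list of the query string. The following is only a minimal sketch using the Java standard library. The class SelectColumnCounter and its method countSelectedColumns are hypothetical names introduced here, not part of Flink or JDBC. It assumes a single plain SELECT statement, and it cannot expand SELECT * or handle double-quoted identifiers that contain commas. For those cases, asking the database itself (for example through java.sql.ResultSetMetaData.getColumnCount()) is likely the more reliable route.

```java
// Hypothetical helper (not part of Flink): naive count of the entries in a SELECT list.
final class SelectColumnCounter {

    private SelectColumnCounter() {
    }

    // Counts comma-separated entries between the first SELECT and the first
    // top-level FROM, ignoring commas inside parentheses and single-quoted strings.
    static int countSelectedColumns(String query) {
        int i = 0;
        while (i < query.length() && !isKeywordAt(query, i, "SELECT")) {
            i++;
        }
        if (i == query.length()) {
            throw new IllegalArgumentException("Not a SELECT query: " + query);
        }
        int depth = 0;            // nesting level of parentheses
        boolean inString = false; // inside a single-quoted SQL literal
        int columns = 1;
        for (i += "SELECT".length(); i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\'') {
                inString = !inString;
            } else if (inString) {
                continue;
            } else if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (depth == 0) {
                if (c == ',') {
                    columns++;
                } else if (isKeywordAt(query, i, "FROM")) {
                    return columns;
                }
            }
        }
        return columns; // no FROM clause, e.g. "SELECT 1, 2"
    }

    // True if the keyword occurs at pos (case-insensitive) as a whole word.
    private static boolean isKeywordAt(String s, int pos, String keyword) {
        if (!s.regionMatches(true, pos, keyword, 0, keyword.length())) {
            return false;
        }
        int end = pos + keyword.length();
        boolean startOk = pos == 0 || !isIdentifierChar(s.charAt(pos - 1));
        boolean endOk = end == s.length() || !isIdentifierChar(s.charAt(end));
        return startOk && endOk;
    }

    private static boolean isIdentifierChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }
}
```

The returned value could then be used as numberOfAttributes when building the generic Tuple and its TypeInformation. Note that Flink tuples go up to Tuple25, so a query selecting more than 25 columns would not fit this approach.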