I'm trying to implement a basic ETL job with Airflow, but I'm stuck at one point:
I have 3 functions. I want to define a global variable for each of them, like:
function a():
    return a_result
function b():
    use a
    return b_result
function c():
    use a and b
Then I want to use these functions in python_callable.
Defining the results as globals in the usual way does not work. Is there a solution?
Answer 0 (score: 1)
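As I wrote in my comment: when your python_callable returns a value, Airflow stores it as an XCom, and the next operator can access that value if you pass the task context to it. See https://airflow.apache.org/concepts.html?highlight=xcom
The following semi-pseudocode illustrates the idea: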
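# Inside a PythonOperator called 'pushing_task'
def push_function():
    return value  # the return value is stored as an XCom

# Inside another PythonOperator where provide_context=True
def pull_function(**context):
    # Pull the value returned by the task with id 'pushing_task'
    value = context['task_instance'].xcom_pull(task_ids='pushing_task')

# Operator arguments must be passed by keyword
pushing_task = PythonOperator(task_id='pushing_task',
                              python_callable=push_function, ...)
pulling_task = PythonOperator(task_id='pulling_task',
                              python_callable=pull_function,
                              provide_context=True, ...)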
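
Applied to the three functions from the question, the same pattern might look roughly like the sketch below. It is only an illustration: `FakeTaskInstance` and `run_tasks` are hypothetical stand-ins (not part of Airflow) that mimic how return values are stored and pulled by task id, so the callables can be tried out without a scheduler. The numbers returned by `a`, `b` and `c` are made up. In a real DAG, each function would be the python_callable of its own PythonOperator with task ids 'a', 'b' and 'c'.

```python
class FakeTaskInstance:
    """Hypothetical stand-in for Airflow's task instance (not an Airflow class)."""

    def __init__(self):
        self._returned = {}  # task_id -> value returned by that task

    def record(self, task_id, value):
        # Mimics Airflow storing a callable's return value as an XCom.
        self._returned[task_id] = value

    def xcom_pull(self, task_ids):
        # Mimics pulling the return value of a single upstream task.
        return self._returned.get(task_ids)


def a(**context):
    return 2  # made-up a_result


def b(**context):
    a_result = context['task_instance'].xcom_pull(task_ids='a')
    return a_result * 10  # made-up b_result that uses a


def c(**context):
    ti = context['task_instance']
    a_result = ti.xcom_pull(task_ids='a')
    b_result = ti.xcom_pull(task_ids='b')
    return a_result + b_result  # uses a and b


def run_tasks(tasks):
    """Hypothetical helper: run callables in order, sharing one fake task instance."""
    ti = FakeTaskInstance()
    for task_id, fn in tasks:
        ti.record(task_id, fn(task_instance=ti))
    return ti


if __name__ == '__main__':
    ti = run_tasks([('a', a), ('b', b), ('c', c)])
    print(ti.xcom_pull(task_ids='c'))
```

The point of the design is that no globals are needed: each function receives what it depends on by pulling it by task id, which also works when the tasks run in separate processes.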