I'm new to Airflow.
I have a scenario in which the parent DAG needs to pass a dynamic number (say, n) to the SubDAG.
The SubDAG will use this number to dynamically create n parallel tasks.
The Airflow documentation doesn't cover a way to achieve this, so I have explored a few approaches:
Option 1: I tried passing the number as an XCom value, but for some reason the SubDAG does not resolve to the passed value.
Parent DAG file

    def load_dag(**kwargs):
        number_of_runs = json.dumps(kwargs['dag_run'].conf['number_of_runs'])
        dag_data = json.dumps({
            "number_of_runs": number_of_runs
        })
        return dag_data

    # ------------------ Tasks ------------------------------
    load_config = PythonOperator(
        task_id='load_config',
        provide_context=True,
        python_callable=load_dag,
        dag=dag)

    t1 = SubDagOperator(
        task_id=CHILD_DAG_NAME,
        subdag=sub_dag(PARENT_DAG_NAME, CHILD_DAG_NAME, default_args, "'{{ ti.xcom_pull(task_ids='load_config') }}'" ),
        default_args=default_args,
        dag=dag,
    )

SubDAG file

    def sub_dag(parent_dag_name, child_dag_name, args, num_of_runs):
        dag_subdag = DAG(
            dag_id='%s.%s' % (parent_dag_name, child_dag_name),
            default_args=args,
            schedule_interval=None)
        variabe_names = {}
        for i in range(num_of_runs):
            variabe_names['task' + str(i + 1)] = DummyOperator(
                task_id='dummy_task',
                dag=dag_subdag,
            )
        return dag_subdag

Option 2: I also tried passing the number as a global variable, but that didn't work.
Option 3: We also tried writing this value to a data file, but the SubDAG throws an error. This may be because we generate the file dynamically.
Can someone help me with this?
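Answer 0 (score: 2)
I did it with option 3. The key is to return a valid DAG with no tasks if the file doesn't exist. So load_config generates a file containing your number of tasks, or more information if needed. Your SubDAG factory would look like:

    def subdag(...):
        sdag = DAG('%s.%s' % (parent, child), default_args=args, schedule_interval=timedelta(hours=1))
        file_path = "/path/to/generated/file"
        # If the file does not exist yet, skip this block and return a valid DAG with no tasks.
        if os.path.exists(file_path):
            with open(file_path) as data_file:
                list_tasks = data_file.readlines()
            for task in list_tasks:
                DummyOperator(
                    task_id='task_' + task.strip(),  # drop the trailing newline from each line
                    default_args=args,
                    dag=sdag,
                )
        return sdag

When the DAG is generated, you will see a SubDAG with no tasks. When the DAG executes, after load_config has finished, you can see the dynamically generated SubDAG.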
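As a rough illustration of the file hand-off described above, the sketch below separates the two halves: a writer that a task such as load_config might call, and a reader that the SubDAG factory might call when the DAG file is parsed. The names write_task_file and read_task_names, and the one-name-per-line file format, are assumptions made for illustration and are not part of the original answer. Airflow itself is left out so the snippet runs on its own.

```python
import os
import tempfile


def write_task_file(file_path, number_of_runs):
    """Write one hypothetical task name per line (what load_config might do)."""
    with open(file_path, "w") as data_file:
        for i in range(number_of_runs):
            data_file.write("task_%d\n" % (i + 1))


def read_task_names(file_path):
    """Return the task names in the file, or an empty list if it is missing.

    An empty list lets the SubDAG factory still return a valid, empty DAG
    before the file has been generated.
    """
    if not os.path.exists(file_path):
        return []
    with open(file_path) as data_file:
        # Strip newlines and skip blank lines so each name is a usable task_id.
        return [line.strip() for line in data_file if line.strip()]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "generated_tasks.txt")
    print(read_task_names(path))  # file not written yet
    write_task_file(path, 3)
    print(read_task_names(path))
```

Inside the factory, each returned name would become the task_id of one operator, so a missing file simply yields a SubDAG with no tasks.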
Answer 1 (score: 1)
Option 1 should work if you just change the call to xcom_pull to include the dag_id of the parent DAG. By default, the xcom_pull call looks for the task_id 'load_config' in its own DAG, where it doesn't exist.
So change the XCom macro to:

    subdag=sub_dag(PARENT_DAG_NAME, CHILD_DAG_NAME, default_args, "'{{ ti.xcom_pull(task_ids='load_config', dag_id='" + PARENT_DAG_NAME + "') }}'" ),
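Hand-concatenating the quotes in that macro is easy to get wrong, so the string can be built by a small helper instead. This is only a sketch of the string construction: xcom_pull_template is a hypothetical helper name, and "parent_dag" is a placeholder value. It uses only the task_ids and dag_id arguments of xcom_pull shown in the answer, and it assumes the resulting string ends up somewhere Airflow actually renders templates.

```python
def xcom_pull_template(task_id, dag_id):
    """Build a Jinja expression that pulls an XCom from a task in another DAG."""
    return "{{ ti.xcom_pull(task_ids='%s', dag_id='%s') }}" % (task_id, dag_id)


if __name__ == "__main__":
    PARENT_DAG_NAME = "parent_dag"  # placeholder value for illustration
    print(xcom_pull_template("load_config", PARENT_DAG_NAME))
```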
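Answer 2 (score: 0)
Jaime's answer will work if the file name you are writing to is not dynamic (for example, you rewrite the same file over and over for each task instance):

    file_path = "/path/to/generated/file"

However, if you need unique file names, or want each task instance to write different content to the file for tasks that run in parallel, Airflow will not work in this case, because there is no way to pass the execution date or a variable outside of a template. Take a look at this post.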
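Answer 3 (score: 0)
Take a look at my answer here, where I describe a way to create tasks dynamically, based on the results of previously executed tasks, using XComs and SubDAGs.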