Question

I'm having trouble structuring my Makefile so that my shell scripts run in the order I need.

This is my current makefile:

## Create data splits
raw_data: src/data/get_data.sh
    src/data/get_data.sh
    hadoop fs -cat data/raw/target/* >> data/raw/target.csv
    hadoop fs -cat data/raw/control/* >> data/raw/control.csv
    hadoop fs -rm -r -f data/raw
    touch raw_data_loaded

split_data: raw_data_loaded
    rm -rf data/interim/splits
    mkdir data/interim/splits    
    $(PYTHON_INTERPRETER) src/data/split_data.py

## Run Models
random_forest: split_data
    nohup $(PYTHON_INTERPRETER) src/models/random_forest.py > random_forest & 

under_gbm: split_data
    nohup $(PYTHON_INTERPRETER) src/models/undersampled_gbm.py > under_gbm &

full_gbm: split_data
    nohup $(PYTHON_INTERPRETER) src/models/full_gbm.py > full_gbm &

# Create predictions from model files
predictions: random_forest under_gbm full_gbm
    nohup $(PYTHON_INTERPRETER) src/models/predictions.py > predictions &

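Problem

Everything works fine until I get to the ## Run Models section. These are all independent scripts that can run once split_data has finished. I want the 3 model scripts to run simultaneously, so I run each of them in the background with &.

The problem is that my last task, predictions, starts running at the same time as the three preceding tasks. What I want is for the 3 simultaneous model scripts to finish, and only then for predictions to run.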

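What I've tried

My proposed solution is to run the final model task, full_gbm, without the &, so that predictions does not run until it has finished. This should work, but I'm wondering whether there is a less 'hacky' way to achieve this. Is there some way to structure the targets to get the same result?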

Answer 1

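You don't say which implementation of make you are using. If it is GNU Make, you can invoke it with the -j option to let it decide which jobs should run in parallel. You can then remove nohup and & from all the commands. predictions will start only after random_forest, under_gbm and full_gbm have all finished, and the build itself will not end until predictions has finished.

In addition, you will not lose the all-important exit statuses of the commands.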
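
As a rough sketch of what that could look like, here is the model and prediction part of the question's makefile with nohup and & removed. It assumes GNU Make, that PYTHON_INTERPRETER is defined elsewhere as in the original, and that the first two rules (raw_data and split_data) stay unchanged. The job count of 3 in the comment is only an illustrative choice matching the three model scripts.

```makefile
# Sketch only, assuming GNU Make. Recipe lines must start with a tab.
# Example invocation (3 is an illustrative job count):
#   make -j3 predictions

## Run Models
# Each recipe now runs in the foreground, so make knows when it ends
# and sees its exit status. With -j, make may run these three at once.
random_forest: split_data
	$(PYTHON_INTERPRETER) src/models/random_forest.py > random_forest

under_gbm: split_data
	$(PYTHON_INTERPRETER) src/models/undersampled_gbm.py > under_gbm

full_gbm: split_data
	$(PYTHON_INTERPRETER) src/models/full_gbm.py > full_gbm

# Create predictions from model files
# Starts only after all three prerequisites above have completed.
predictions: random_forest under_gbm full_gbm
	$(PYTHON_INTERPRETER) src/models/predictions.py > predictions
```

If the build still needs to survive a logout, one option is to apply nohup and & once to the make invocation itself rather than to the individual recipes.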
