Passing a list as an argument to a bash script called from subprocess.check_call()

Time: 2018-06-01 02:19:43

Tags: python arrays bash subprocess

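I have a Python script that queries a database and gets a result set. I then run a series of steps on that result set using a set of bash scripts invoked with subprocess.check_call(). This is my current Python script: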

...
...

def get_ids_and_process():
    bqc = bigquery.Client.from_service_account_json(gcloud_account_key)
    query = (
        'SELECT ProjectID, ProjectName, UserEmail from `table_name`')
    query_job = bqc.query(query)
    data = query_job.result()
    data_rows = list(data)
    if len(data_rows) == 0:
        sys.exit()
    else:
        for row in data_rows:
            subprocess.check_call(
                [scripts_dir + "create_project.sh", str(row['ProjectID']), str(row['ProjectName']), folder_id])
            subprocess.check_call([scripts_dir + "link_billing.sh", str(row['ProjectID']), account_id])
            subprocess.check_call([scripts_dir + "add_iam_policy.sh", str(row['ProjectID']), str(row['UserEmail'])])
            subprocess.check_call([scripts_dir + "create_datasets.sh", str(row['ProjectID'])])
            subprocess.check_call([scripts_dir + "create_tables.sh", str(row['ProjectID'])])
...
...

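Since this is inherently iterative, running all of these scripts takes a while, so I thought of parallelizing the loop with joblib and pool.map(). That did not work, though: joblib failed with a "Maximum recursion depth reached!" error, and pool.map() gave a similar error.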
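For reference, the row-level parallelism described above can also be sketched with a thread pool from the standard library. This is only a minimal sketch, not a confirmed fix for the recursion error: it assumes the rows support `row['ProjectID']`-style access as in the loop above, and `build_commands`, `process_row`, `process_rows`, `max_workers` and the injectable `run` parameter are hypothetical names introduced here. Threads are sufficient because each worker spends its time waiting on a child process, and they avoid pickling the row objects, which may or may not be what triggers the error.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def build_commands(row, scripts_dir, folder_id, account_id):
    """Return the argv lists for one row, in the order they must run."""
    project_id = str(row["ProjectID"])
    return [
        [scripts_dir + "create_project.sh", project_id, str(row["ProjectName"]), folder_id],
        [scripts_dir + "link_billing.sh", project_id, account_id],
        [scripts_dir + "add_iam_policy.sh", project_id, str(row["UserEmail"])],
        [scripts_dir + "create_datasets.sh", project_id],
        [scripts_dir + "create_tables.sh", project_id],
    ]


def process_row(row, scripts_dir, folder_id, account_id, run=subprocess.check_call):
    """Run every step for a single project sequentially."""
    for argv in build_commands(row, scripts_dir, folder_id, account_id):
        run(argv)  # check_call raises CalledProcessError on a non-zero exit
    return str(row["ProjectID"])


def process_rows(rows, scripts_dir, folder_id, account_id,
                 max_workers=4, run=subprocess.check_call):
    """Process different projects concurrently; steps within a project stay ordered."""
    worker = partial(process_row, scripts_dir=scripts_dir, folder_id=folder_id,
                     account_id=account_id, run=run)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() consumes the results so an exception from any worker surfaces here.
        return list(pool.map(worker, rows))
```

With this shape, the `for row in data_rows:` loop would be replaced by a single `process_rows(data_rows, scripts_dir, folder_id, account_id)` call. The `max_workers` value of 4 is an arbitrary placeholder and would need tuning against whatever rate limits the scripts run into.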

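In a few of my bash scripts, I read a file and, for each value in the file, run a command together with the arguments passed in from check_call(). I parallelized that loop with & and it works perfectly. So I figured that, since joblib didn't work out, I could pass the result set to the bash scripts as lists and process each element of the array in parallel there. Something like this: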

projectids = []
projectnames = []
emails = []

def process_ids():
    subprocess.check_call(
        [scripts_dir + "create_project_from_array.sh", projectids, projectnames])
    subprocess.check_call([scripts_dir + "link_billing_from_array.sh", projectids])

def get_ids():
    bqc = bigquery.Client.from_service_account_json(gcloud_account_key)
    query = (
        'SELECT ProjectID, ProjectName, UserEmail from `table_name`')
    query_job = bqc.query(query)
    data = query_job.result()
    data_rows = list(data)
    if len(data_rows) == 0:
        sys.exit()
    else:
        for row in data_rows:
            projectids.append(row['ProjectID'])
            projectnames.append(row['ProjectName'])
            emails.append(row['UserEmail'])

        process_ids()


get_ids()

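Then, in my bash script, I would read the arrays and, for each pair of elements, run the required command in parallel using &. But apparently we can't pass a list as an argument to subprocess.check_call(). What are my options in this case? Is what I'm trying to do even possible?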
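One possible option, sketched here under assumptions: every element of the argument list given to subprocess.check_call() must be a string (or path-like object), so each Python list can be joined into a single delimited string before the call and split again inside the script. The helpers `list_to_arg` and `build_argv` and the comma separator are hypothetical choices for illustration, and the sketch assumes no project ID or name contains the separator.

```python
def list_to_arg(values, sep=","):
    """Join values into one argv string; reject values containing the separator."""
    items = [str(v) for v in values]
    for item in items:
        if sep in item:
            raise ValueError(f"value {item!r} contains separator {sep!r}")
    return sep.join(items)


def build_argv(script, *lists, sep=","):
    """Build an argv list in which each Python list becomes one string argument."""
    return [script] + [list_to_arg(lst, sep) for lst in lists]
```

The call would then look like `subprocess.check_call(build_argv(scripts_dir + "create_project_from_array.sh", projectids, projectnames))`, and the bash side could rebuild an array with something like `IFS=',' read -ra ids <<< "$1"`. For very large result sets the operating system's limit on argument length applies, in which case writing the values to a file or to the script's standard input would be a safer route.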

0 answers:

No answers