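How to use a script as a command-line application or as a module

Asked: 2016-03-02 00:25:32

Tags: python command-line argparse

I feel this should be obvious, but I have been googling it and have not found a solution, so I would appreciate your help:

I wrote a Python script that I want to be usable as a command-line application, but also as a module that I can load from other applications. So far I have only written the code for the command-line options. Here is a summary of how my file etl.py is structured: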

import os
import re
import sys
import shlex
import argparse
import itertools
import subprocess
from sqlalchemy import create_engine, text
# other imports here...

def etl(argv):
    """
    ETL function for rasters ...
    # more doc here
    """
    args = parser.parse_args(argv)
    d = vars(args)

    os.chdir(d.get("root_dir"))
    files = [f for f in os.listdir(".") for p in d.get("products") if p in f]

    # a) Reprojection (option "r")
    if d.get("which") == "r":
        for f in files:
            ofile = os.path.abspath(os.path.join(d.get("reproj_dir"), f))
            if os.path.exists(ofile):
                if d.get("overwrite") is True:
                    gdutil.reproject(os.path.abspath(f), ofile, d.get("proj"))
                else:
                    print("{}: File already exists. skipping".format(ofile))
            else:
                gdutil.reproject(os.path.abspath(f), ofile, d.get("proj"))
        print("All files reprojected into EPSG {}".format(d.get("proj")))

    # b) Merge (option "m")
    if d.get("which") == "m":
        # more operations here until...

if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description="""
        'Extract, Transform, Load' script for rasters ...""",
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument("-dir", help="""
        directory containing the input rasters to process

        default value is the current directory""",
        dest="root_dir", default=os.getcwd(), const=os.getcwd(), nargs="?", metavar="DIR")
    gen_opts = argparse.ArgumentParser(add_help=False)
    gen_opts.add_argument("-p", help="""
        product type(s) to process

        default is to process all products in DIR.           

        """.format(os.path.basename(sys.argv[0])),
        dest="products", nargs="*", metavar="PROD")

    # more gen_opts arguments here

    subparsers = parser.add_subparsers()
    parser_r = subparsers.add_parser("r", help="""
        reproject rasters into target reprojection

        if rasters corresponding to the passed arguments are found, they will be
        reprojected into the target spatial reference system as defined by
        the passed EPSG code

        reprojected files will have the same pixel resolution as the input
        rasters

        resampling method is nearest neighbor     

        reprojected files are saved to the folder specified with option -r_dir
        or to its default value if missing
        """.format(os.path.basename(sys.argv[0])),
        parents=[gen_opts, overw_opts])
    parser_r.add_argument("proj", help="""
        EPSG code of the destination reprojection""",
        type=int, metavar="EPSG_CODE")

    # more parser_r arguments here

    parser_r.set_defaults(which="r")
    parser_m = subparsers.add_parser("m", help="""
        merge input rasters into a mosaic

        rasters are merged based on the supplied level

        all files that are included in a merge must have the same projection,
        pixel size, no-data value, and be unique in space. otherwise, results
        are not guaranteed""", parents=[gen_opts, group_opt, overw_opts])
    parser_m.add_argument("-m_dir", help="""
        if supplied, the merged rasters will be saved in directory M_DIR

        if not supplied they will be saved to a subfolder named 'merged'
        located at in DIR, or in R_DIR if a reprojection was made in the same
        call""",
        dest="merge_dir", type=str, nargs=1, metavar="M_DIR")
    parser_m.set_defaults(which="m")

    # more parsers and arguments here

    # etl(sys.argv)
    etl(["-dir", "E:\\Data\\Spatial\\soilgrids.org", "r", "3175", "-c", "M"])

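So far, I can only run it with the last line uncommented and by calling the script from the terminal without arguments, but that is not the main problem right now.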

My question is: how can I use the script as a module that I can import in other scripts, e.g. from etl import reproject?

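I think one thing to do is to put the parts of my code (i.e. each section that starts with a comment, such as a) Reprojection, b) Merge, c) ...) into their own functions: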

def reproject():
    # add code here

def merge():
    # add code here

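and then add a default function to each parser (e.g. parser_r.set_defaults(func=reproject)). But if I also want to use these functions from another application that imports the module etl, how should I define the parameters of each function, e.g.: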

from etl import reproject
reproject() #  arguments?

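Do I have to add optional arguments or keywords? Do I have to test whether the arguments can be parsed with parser.parse_args? How would I do that?

Thanks a lot for your help!

1 Answer:

Answer 0 (score: 1)

You have to change the call signature (and the body) of your function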

def etl(argv):
    ...

so that it does not do anything with argv. Instead, it would look something like:

def etl(products, reproj_dir, ...):
    ...

This is called defining the interface of your function.
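As a minimal sketch of such an interface, the reprojection branch from the question could take its inputs as ordinary parameters. The parameter names and the reproject_one callable are assumptions made for illustration: reproject_one stands in for gdutil.reproject, whose real signature is not shown in the question.

```python
import os


def reproject(files, reproj_dir, proj, reproject_one, overwrite=False):
    """Reproject each file into reproj_dir and return the paths written.

    reproject_one is a hypothetical callable (source, destination, proj)
    standing in for gdutil.reproject from the question.
    """
    written = []
    for f in files:
        # Keep only the file name so the output lands inside reproj_dir.
        ofile = os.path.abspath(os.path.join(reproj_dir, os.path.basename(f)))
        if os.path.exists(ofile) and not overwrite:
            print("{}: File already exists. skipping".format(ofile))
            continue
        reproject_one(os.path.abspath(f), ofile, proj)
        written.append(ofile)
    return written
```

Nothing in this function refers to argv or argparse, so it can be called the same way from the command-line wrapper and from any other script.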

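You can keep all of the argument-parsing code inside the if __name__ == '__main__': block, but when you call etl from there, you should pass it the values parsed from sys.argv rather than argv itself. You should also move the line that calls parser.parse_args into this block.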
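One possible layout, sketched under assumptions: the parser is built and used only in a main() function that the __main__ guard calls, and the parsed values are handed to plain functions. The bodies of reproject and merge here are hypothetical stand-ins that only report what they would do, and the parser is a cut-down version of the one in the question.

```python
import argparse
import os


def reproject(root_dir, products, proj):
    """Hypothetical stand-in: only reports what it would do."""
    return "reproject {} in {} to EPSG {}".format(products, root_dir, proj)


def merge(root_dir, products, merge_dir=None):
    """Hypothetical stand-in: only reports what it would do."""
    target = merge_dir or os.path.join(root_dir, "merged")
    return "merge {} in {} into {}".format(products, root_dir, target)


def build_parser():
    parser = argparse.ArgumentParser(description="ETL script for rasters")
    parser.add_argument("-dir", dest="root_dir", default=os.getcwd(),
                        metavar="DIR")
    # Options shared by every subcommand.
    gen_opts = argparse.ArgumentParser(add_help=False)
    gen_opts.add_argument("-p", dest="products", nargs="*", default=[],
                          metavar="PROD")
    subparsers = parser.add_subparsers(dest="which")
    parser_r = subparsers.add_parser("r", parents=[gen_opts])
    parser_r.add_argument("proj", type=int, metavar="EPSG_CODE")
    parser_m = subparsers.add_parser("m", parents=[gen_opts])
    parser_m.add_argument("-m_dir", dest="merge_dir", metavar="M_DIR")
    return parser


def main(argv=None):
    parser = build_parser()
    # argv=None makes argparse read sys.argv[1:].
    args = parser.parse_args(argv)
    if args.which == "r":
        return reproject(args.root_dir, args.products, args.proj)
    if args.which == "m":
        return merge(args.root_dir, args.products, args.merge_dir)
    parser.print_help()
    return None


if __name__ == "__main__":
    result = main()
    if result is not None:
        print(result)
```

Because main() accepts an optional list, the hard-coded test call from the question becomes main(["-dir", "...", "r", "3175"]) without touching sys.argv, and importing the module never triggers any parsing.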

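Looking at the body of your function, it seems to have two modes, so it really would be better to write def reproject and def merge, and then decide on interfaces for those, as you were thinking.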

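Once you have suitable interfaces, just import the module and call those functions directly. The actual functions do not need to know anything about the command-line interface.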
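To illustrate that importing does not run the command-line part, the sketch below writes a tiny stand-in module to a temporary directory, imports it, and calls its function directly. The module name etl_demo and the body of its reproject function are hypothetical and are not the code from the question.

```python
import importlib
import os
import sys
import tempfile

# Source of a tiny stand-in module (hypothetical, for illustration only).
MODULE_SOURCE = '''
def reproject(files, proj):
    return ["{} -> EPSG {}".format(f, proj) for f in files]

if __name__ == "__main__":
    raise SystemExit("command-line entry point; not reached on import")
'''

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "etl_demo.py"), "w") as fh:
        fh.write(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        # Make sure the import system notices the freshly written file.
        importlib.invalidate_caches()
        etl_demo = importlib.import_module("etl_demo")
    finally:
        sys.path.remove(tmp)

# On import, __name__ is "etl_demo", so the guarded block was skipped.
result = etl_demo.reproject(["a.tif", "b.tif"], 3175)
print(result)
```

In a real project the temporary-directory steps disappear: with etl.py on the import path, from etl import reproject followed by a call with explicit arguments is all that is needed.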