Question

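I want to build a pandas DataFrame of random data. To do that, I need a Python function that takes as arguments:

numpy distributions
their arguments

For example:

Distribution 1: normal | parameters: mean = 0, std dev = 1, size = 100

Distribution 2: uniform | parameters: low = 0, high = 1, size = 100

etc...

I don't know in advance which distributions will be used or what their parameters will be.

The main function will then generate a random sample from each distribution using its corresponding parameters.

I tried something like this: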

import numpy as np

def myfun( **kwargs ) :
    for k , v in kwargs.items() :
        print( k )
        print( v )

When I call the function with the following arguments:

myfun( fun_1 = 'np.random.normal' , arg_1 = { 'loc' : 0 , 'scale' : 1 , 'size' : 7 } ,
       fun_2 = 'np.random.uniform' , arg_2 = { 'low' : 0 , 'high' : 1 , 'size' : 7 } )

The output is:

fun_1
np.random.normal
arg_1
{'loc': 0, 'scale': 1, 'size': 7}
fun_2
np.random.uniform
arg_2
{'low': 0, 'high': 1, 'size': 7}

But my goal is not to print the requested distributions and their parameters; it is to generate a sample from each distribution.

Answer 1

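Note: for this implementation to work, the functions must be actual function objects, not strings.

If you want to return the result of a function called with a set of kwargs, you are very close. I would use a positional argument for func and then pass kwargs through to func, which is more explicit: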

def myfunc(func, **kwargs):
    return func(**kwargs)

You can then wrap each func, kwargs pair in a tuple and loop over them:

# This would be called like
somelist = [(np.random.normal, { 'loc' : 0 , 'scale' : 1 , 'size' : 7 }),
            (np.random.uniform , { 'low' : 0 , 'high' : 1 , 'size' : 7 })]

results = []

# append results to a list
for func, kwargs in somelist:
    results.append(myfunc(func, **kwargs))

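This way you don't have to worry about naming any of the variables, and the code is more readable. You know the loop processes pairs of items, in this case func, kwargs pairs, and your function can handle them explicitly.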
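Since the question is ultimately after a DataFrame, the loop above could be turned into columns roughly as follows. This is only a sketch: build_random_frame and the column labels are names made up for illustration, and it assumes every sample has the same size so that the columns line up.

```python
import numpy as np
import pandas as pd

def build_random_frame(specs):
    """Return a DataFrame with one column per (func, kwargs) spec.

    `specs` maps a column label to a (func, kwargs) pair.
    Hypothetical helper; not part of numpy or pandas.
    """
    columns = {}
    for name, (func, kwargs) in specs.items():
        # Call the distribution function with its own keyword arguments
        columns[name] = func(**kwargs)
    return pd.DataFrame(columns)

specs = {
    "normal": (np.random.normal, {"loc": 0, "scale": 1, "size": 7}),
    "uniform": (np.random.uniform, {"low": 0, "high": 1, "size": 7}),
}
df = build_random_frame(specs)
print(df.shape)  # (7, 2)
```

Keying the specs by column label keeps the func, kwargs pairing from above and gives each column a readable name.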

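Handling string calls

There are a few ways to do this. They are a bit trickier, but overall not terrible. You need to modify myfunc so that it can handle a function name: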

# func is now a string, unlike above

def myfunc(func, **kwargs):
    # function will look like module.class.function
    # so split on '.' to get each component. The first will 
    # be the parent module in global scope, and everything else
    # is collected into a list
    mod, *f = func.split('.') # f is a list of sub-modules like ['random', 'uniform']
    # func for now will just be the module np
    func = globals().get(mod)
    for cls in f:
        # get each subsequent level down, which will overwrite func to
        # first be np.random, then np.random.uniform
        func = getattr(func, cls)
    return func(**kwargs)

The reason I use globals().get(mod) is that (a) I assume you might not always be using the same module, and (b) looking up a renamed import in sys.modules raises a KeyError, which is not what you want:

import sys
import numpy as np

sys.modules['np'] # KeyError

sys.modules['numpy']
# <module 'numpy' from '/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/numpy/__init__.py'>

# globals avoids the naming conflict
globals()['np']
# <module 'numpy' from '/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/numpy/__init__.py'>

Then getattr(obj, attr) returns each subsequent module:

import numpy as np

getattr(np, 'random')
# <module 'numpy.random' from '/Users/mm92400/anaconda3/envs/new36/lib/python3.6/site-packages/numpy/random/__init__.py'>

# the dotted access won't work directly
getattr(np, 'random.uniform')
# AttributeError

So, in total, a call such as myfunc('np.random.uniform', low=0, high=1, size=7) resolves the dotted name one attribute at a time and then calls the result. You can extend this to the loop in the first section.
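As a rough sketch of that extension, the lookup can be given an explicit namespace instead of globals(), so that an unknown module name fails with a clear error rather than an AttributeError on None. The names resolve, call_by_name and allowed are hypothetical and only serve the illustration.

```python
import numpy as np

def resolve(dotted_name, namespace):
    """Turn a string such as 'np.random.uniform' into the object it names."""
    root, *attrs = dotted_name.split(".")
    if root not in namespace:
        raise NameError(f"{root!r} is not defined in the given namespace")
    obj = namespace[root]
    for attr in attrs:
        # Walk down one attribute at a time: np -> np.random -> np.random.uniform
        obj = getattr(obj, attr)
    return obj

def call_by_name(dotted_name, namespace, **kwargs):
    """Resolve the dotted name, then call it with the given keyword arguments."""
    return resolve(dotted_name, namespace)(**kwargs)

# Only names listed here can be resolved (an assumption made for this sketch)
allowed = {"np": np}

specs = [
    ("np.random.normal", {"loc": 0, "scale": 1, "size": 7}),
    ("np.random.uniform", {"low": 0, "high": 1, "size": 7}),
]
results = [call_by_name(name, allowed, **kwargs) for name, kwargs in specs]
```

Passing the namespace explicitly also limits which modules a user-supplied string can reach, which may matter if the strings come from outside your code.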

Answer 2

You can design a function that takes other functions as input and executes them. The ** operator does just that:

def myfun(**kwargs):
    kwargs['fun_1'](**kwargs['arg_1'])  # calls the function kwargs[fun_1] with the keyword args given in kwargs[arg_1]
    kwargs['fun_2'](**kwargs['arg_2'])

You would then specify your kwargs like this:

myfun(fun_1=np.random.normal, 
      arg_1={'loc': 0, 'scale': 1, 'size': 7},
      fun_2=np.random.uniform,
      arg_2={'low': 0, 'high': 1, 'size': 7},
     )

Note that np.random.normal is not in quotes: we are referring to the actual function object, but not calling it yet (because we want to do that inside myfun(), not now).
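If the number of distributions is not known in advance, the same idea might be generalized along these lines. This is a sketch only: run_all is a made-up name, and the fun_<id> / arg_<id> keyword pattern is an assumption carried over from the call above.

```python
import numpy as np

def run_all(**kwargs):
    """Call every fun_<id> with the keyword arguments stored under arg_<id>."""
    samples = {}
    for key, value in kwargs.items():
        if not key.startswith("fun_"):
            continue  # arg_<id> entries are looked up below
        suffix = key[len("fun_"):]
        # A missing arg_<id> is treated as "no arguments"
        params = kwargs.get("arg_" + suffix, {})
        samples[key] = value(**params)
    return samples

samples = run_all(
    fun_1=np.random.normal, arg_1={"loc": 0, "scale": 1, "size": 7},
    fun_2=np.random.uniform, arg_2={"low": 0, "high": 1, "size": 7},
)
print(sorted(samples))  # ['fun_1', 'fun_2']
```

Unlike the two hard-coded calls, this version returns the samples so the caller can use them.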

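I don't think this operator has an official name (* is for lists, ** is for dictionaries), but I call it the unpacking operator, because it unpacks a data structure into function arguments.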
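A small illustration of both forms, using a throwaway function named describe (hypothetical, only for this example):

```python
def describe(loc, scale, size):
    """Return the arguments as text so the effect of unpacking is visible."""
    return f"loc={loc}, scale={scale}, size={size}"

args = [0, 1, 7]
kwargs = {"loc": 0, "scale": 1, "size": 7}

# *args fills the parameters by position, **kwargs fills them by name
by_position = describe(*args)
by_name = describe(**kwargs)
print(by_position)  # loc=0, scale=1, size=7
```

Both calls end up passing the same three arguments; only the way they are matched to parameters differs.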

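In this situation it is usually safer to declare explicit named parameters. Otherwise you need to come up with a naming pattern so that people using your function know how they should name their keywords.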
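To make the difference concrete, here is a minimal sketch with explicit parameters. The name sample_pair is hypothetical; the point is only that Python itself rejects a misspelled keyword at call time.

```python
import numpy as np

def sample_pair(fun_1, arg_1, fun_2, arg_2):
    """Explicit parameters: the signature documents the expected keywords."""
    return fun_1(**arg_1), fun_2(**arg_2)

normal_sample, uniform_sample = sample_pair(
    fun_1=np.random.normal, arg_1={"loc": 0, "scale": 1, "size": 7},
    fun_2=np.random.uniform, arg_2={"low": 0, "high": 1, "size": 7},
)

# A typo in a keyword name ('fun1' instead of 'fun_1') fails immediately
try:
    sample_pair(
        fun1=np.random.normal, arg_1={"loc": 0, "scale": 1, "size": 7},
        fun_2=np.random.uniform, arg_2={"low": 0, "high": 1, "size": 7},
    )
except TypeError as exc:
    error = exc
```

With a bare **kwargs signature the same typo would be accepted and only surface later, for example as a KeyError inside the function.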

Function that takes sub-functions and their arguments as arguments
