"Function" & "for" logic in pandas

Time: 2015-05-15 16:00:50

Tags: python python-2.7 python-3.x pandas

I have input parameters that are passed in through a parameter dictionary, as shown below -

paramDict = {  "Period":
                    {
                    "Description": "A Period",
                    "Value" : ['9']
                    },
               "AdditionalPeriods":
                        {
                        "Description": "An AdditionalPeriod",
                        "Value" : ['1']
                        }
            }

I also have a DataFrame, "df_AssetCst", as shown below -

>>> df_AssetCst.dtypes
FLCO                object
FLN01               object
FLN02               object
FLN03               object
FLN04               object
FLN05               object
FLN06               object
FLN07               object
FLN08               object
FLN09               object
FLN10               object
FLN11               object
FLN12               object
FLN13               object
FLN14               object

Now, based on the value passed in the parameter dictionary, I want to implement the following "if-else" condition in Python pandas -

def func(row):
    if pd.Series(paramDict['AdditionalPeriods']['Value'][0]) == '0':
        return '0'        
    elif pd.Series(paramDict['AdditionalPeriods']['Value'][0]) == '1':
        return df_AssetCst['FLN13']   
    elif pd.Series(paramDict['AdditionalPeriods']['Value'][0]) == '2':
        return (df_AssetCst['FLN13'].astype(int)
              + df_AssetCst['FLN14'].astype(int)) 
    else:
        return 'other'  

In SQL, the case logic above would look like this -

Case AdditionalPeriods = 0, Then NewColumn = 0
Case AdditionalPeriods = 1, Then NewColumn = FLN13
Case AdditionalPeriods = 2, Then NewColumn = FLN13 + FLN14

Now I want to use that function to create a new column in the DataFrame -

df_AssetCst['NewColumn'] = df_AssetCst.apply(func, axis=1)

However, this gives me the following error -

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', u'occurred at index 0')

After that, I want to implement the following "for" logic -

if Period.value = '9'
then NewColumn2 =  FLN01+FLN02+FLN03+FLN04+FLN05+FLN06+FLN07+FLN08+FLN09 

Could you point me in the right direction? What is the best way to implement this? Thanks.

**********My solution***********

#1. Function logic :
def func(row):   
    if paramDict['AdditionalPeriods']['Value'][0] == '0':
        var = 0       
    elif paramDict['AdditionalPeriods']['Value'][0] == '1':
        var = int(row['FLN13'])   
    elif paramDict['AdditionalPeriods']['Value'][0] == '2':
        var = int(row['FLN13']) + int(row['FLN14'])                                 
    else:
        var = -1

    return(var) 

#2. For logic
In_Period = int(paramDict['Period']['Value'][0])  # 'Value' holds strings, so convert before passing to range()
colList = ['FLN{:0>2}'.format(X) for X in range(1, In_Period + 1)]
df_AssetCst['NewColumn1'] = df_AssetCst[colList].astype(int).sum(axis=1)

1 answer:

Answer 0: (score: 1)

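Using the func function from your updated post (i.e. your solution), you should be able to use the DataFrame.apply method with the argument axis=1. (I haven't tested it, but you could try applying it and report back the error message, if any.)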
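For context, here is a minimal sketch of why the first version in the question raised the ValueError, assuming the flag is the string '1' as in the question's dictionary. Wrapping the flag in pd.Series makes == return a boolean Series, and an if statement cannot reduce a Series to a single True/False. The variable names below are made up for illustration.

```python
import pandas as pd

# What pd.Series(paramDict['AdditionalPeriods']['Value'][0]) builds in the question.
flag = pd.Series('1')

# Element-wise comparison: the result is a boolean Series, not a single bool.
comparison = flag == '1'

try:
    if comparison:  # `if` asks the Series for one truth value, which pandas refuses
        pass
    raised = False
except ValueError:
    raised = True

# Comparing the plain string, as the updated func does, yields an ordinary bool.
plain = '1' == '1'
```

Dropping the pd.Series wrapper and reading values from row (rather than from the whole df_AssetCst) is what makes the updated func usable with apply.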

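However, this function refers to paramDict in the global scope. That works, but if you are not careful it can have unintended consequences and, IMO, cause more headaches later.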
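A small sketch of the kind of surprise meant here, using hypothetical helper names (mode_from_global, mode_from_argument) that are not part of the question: a function that reads a global silently picks up whatever the name is bound to at call time, while a function that takes the dictionary as an argument only sees what it is handed.

```python
# Hypothetical, trimmed-down parameter dictionary for illustration only.
param_dict = {"AdditionalPeriods": {"Value": ["1"]}}

def mode_from_global():
    # Looks up the module-level name each time it is called.
    return param_dict["AdditionalPeriods"]["Value"][0]

def mode_from_argument(params):
    # Depends only on what the caller passes in.
    return params["AdditionalPeriods"]["Value"][0]

snapshot = {"AdditionalPeriods": {"Value": ["1"]}}

before = mode_from_global()

# Some other part of the program rebinds the global name...
param_dict = {"AdditionalPeriods": {"Value": ["2"]}}

after = mode_from_global()               # now sees the new value
explicit = mode_from_argument(snapshot)  # unaffected by the rebinding
```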

Here is another version of the func function. It takes two arguments: a row (of type pd.Series) and param_dict (the dictionary you provided in the question).

def func(row, param_dict):
    """
    Gets the key from the param_dict, and tries to return the element
    from _conversion_map.
    if the key doesn't exist in the conversion map, then returns -1
    """
    key = param_dict['AdditionalPeriods']['Value'][0]
    _conversion_map = {
        '0': 0,
        '1': int(row.FLN13),
        '2': int(row.FLN13) + int(row.FLN14)
    }
    try:
        return _conversion_map[key]
    except KeyError:
        return -1

Then this should work:

df_AssetCst['NewColumn'] = df_AssetCst.apply(func, axis=1, param_dict=paramDict)

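The key to this answer is that the apply method accepts extra positional arguments (via args) and arbitrary keyword arguments, and passes them on to the function.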
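To make that concrete, here is a minimal end-to-end sketch. The two-row frame and its values are invented stand-ins for df_AssetCst, and the flag is set to '2' only so that both columns are used. The function follows the same mapping as above, written with plain if statements.

```python
import pandas as pd

# Hypothetical stand-in for df_AssetCst; values are strings (object dtype) as in the question.
df = pd.DataFrame({"FLN13": ["10", "20"], "FLN14": ["1", "2"]})
params = {"AdditionalPeriods": {"Description": "An AdditionalPeriod", "Value": ["2"]}}

def func(row, param_dict):
    """Return the per-row value selected by the AdditionalPeriods flag, or -1 if unknown."""
    key = param_dict["AdditionalPeriods"]["Value"][0]
    if key == "0":
        return 0
    if key == "1":
        return int(row["FLN13"])
    if key == "2":
        return int(row["FLN13"]) + int(row["FLN14"])
    return -1

# Keyword arguments that apply does not recognise are forwarded to func on every row.
df["NewColumn"] = df.apply(func, axis=1, param_dict=params)

# The same call with the dictionary passed positionally through `args`.
df["NewColumn_args"] = df.apply(func, axis=1, args=(params,))
```

Both columns should hold the same values; which form to use is mostly a matter of taste, though the keyword form keeps the call self-describing.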