How do I apply multiple functions to a single pandas DataFrame column?

Time: 2017-02-21 14:24:01

Tags: python python-3.x pandas map-function

I am curious whether it is possible to apply several functions to a single pandas DataFrame column. For example, suppose I have three functions:

In:

def foo(col):
    if 'hi' in col:
        return 'TRUE'

def bar(col):
    if 'bye' in col:
        return 'TRUE'

def baz(col):
    if 'ok' in col:
        return 'TRUE'

and the following DataFrame:

import pandas as pd

dfs = pd.DataFrame({'col':['The quick hi brown fox hi jumps over the lazy dog',
                           'The quick hi brown fox bye jumps over the lazy dog',
                           'The NO quick brown fox ok jumps bye over the lazy dog']})

If I want to apply each function to col, I would normally use the pandas apply function:

dfs['new_col1'] = dfs['col'].apply(foo)

dfs['new_col2'] = dfs['col'].apply(bar)

dfs['new_col3'] = dfs['col'].apply(baz)

dfs

Out:

    col     new_col1    new_col2    new_col3
0   The quick hi brown fox hi jumps over the lazy dog   TRUE    None    None
1   The quick hi brown fox bye jumps over the lazy...   TRUE    TRUE    None
2   The NO quick brown fox ok jumps bye over the l...   None    TRUE    TRUE

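However, as you can see, this creates three columns. So my question is: how can I efficiently apply the three functions above to a particular column at the same time in a large DataFrame? The expected result would be: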

    col                                                 new_col
0   The quick hi brown fox hi jumps over the lazy dog   TRUE
1   The quick hi brown fox bye jumps over the lazy...   TRUE, TRUE
2   The NO quick brown fox ok jumps bye over the l...   TRUE, TRUE

Note that I know I could merge the three columns into one. However, I would like to know whether the above is possible.

4 answers:

Answer 0 (score: 4)

Why not wrap all the functions in one giant function?

def oneGiantFunc(col):
    def foo(col):
        if 'hi' in col:
            return 'TRUE'

    def bar(col):
        if 'bye' in col:
            return 'TRUE'

    def baz(col):
        if 'ok' in col:
            return 'TRUE'

    a = foo(col)
    b = bar(col)
    c = baz(col)
    # Note: a function that found no match returns None, which is formatted as the string 'None'
    return '{} {} {}'.format(a, b, c)

dfs['new_col'] = dfs['col'].apply(oneGiantFunc)

Answer 1 (score: 2)

You can use apply with a list comprehension to filter out the None values:

dfs['new_col'] = dfs['col'].apply(lambda x: ', '.join([r for r in
                                            [foo(x), bar(x), baz(x)] if r is not None]))
print (dfs)
                                                 col     new_col
0  The quick hi brown fox hi jumps over the lazy dog        TRUE
1  The quick hi brown fox bye jumps over the lazy...  TRUE, TRUE
2  The NO quick brown fox ok jumps bye over the l...  TRUE, TRUE
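
Since the question asks about efficiency on a large DataFrame, here is a minimal sketch of a vectorized alternative that skips the per-row Python functions entirely. It assumes the three functions are plain substring checks, as in the question; the names keywords, flags and counts are introduced here for illustration only.

```python
import pandas as pd

dfs = pd.DataFrame({'col': [
    'The quick hi brown fox hi jumps over the lazy dog',
    'The quick hi brown fox bye jumps over the lazy dog',
    'The NO quick brown fox ok jumps bye over the lazy dog',
]})

# Assumption: each function only checks for one literal substring.
keywords = ['hi', 'bye', 'ok']

# One boolean column per keyword, using pandas' vectorized substring search.
flags = pd.concat(
    [dfs['col'].str.contains(kw, regex=False) for kw in keywords],
    axis=1,
    keys=keywords,
)

# Number of keywords matched in each row.
counts = flags.sum(axis=1)

# Build 'TRUE', 'TRUE, TRUE', ... with one 'TRUE' per matched keyword.
dfs['new_col'] = counts.map(lambda n: ', '.join(['TRUE'] * int(n)))

print(dfs['new_col'].tolist())
```

This only fits when the checks can be expressed as string operations; for arbitrary functions, apply is still the general tool.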

Answer 2 (score: 1)

I don't think you can do this "simultaneously". However, here are two options:

1. Assuming the functions are defined as above:

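# foo/bar/baz return 'TRUE' or None, so .notna() turns each result into a boolean before combining with &
dfs['new_col1'] = dfs['col'].apply(foo).notna() & dfs['col'].apply(bar).notna() & dfs['col'].apply(baz).notna()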

2. Redefine the function:

def aao(col):  # all at once
    if ('hi' in col) and ('bye' in col) and ('ok' in col):
        return 'TRUE'

dfs['new_col'] = dfs['col'].apply(aao)
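
Combining per-function results with & is most direct when each function returns a real boolean rather than 'TRUE' or None. The sketch below assumes hypothetical boolean variants (has_hi, has_bye, has_ok, not part of the original question) and shows both an "all match" and an "any match" column.

```python
import pandas as pd

dfs = pd.DataFrame({'col': [
    'The quick hi brown fox hi jumps over the lazy dog',
    'The quick hi brown fox bye jumps over the lazy dog',
    'The NO quick brown fox ok jumps bye over the lazy dog',
]})

# Hypothetical boolean versions of foo, bar and baz.
def has_hi(text):
    return 'hi' in text

def has_bye(text):
    return 'bye' in text

def has_ok(text):
    return 'ok' in text

hi = dfs['col'].apply(has_hi)
bye = dfs['col'].apply(has_bye)
ok = dfs['col'].apply(has_ok)

# True only where all three substrings occur in the row.
dfs['all_three'] = hi & bye & ok

# True where at least one of the substrings occurs.
dfs['any_of_three'] = hi | bye | ok

print(dfs[['all_three', 'any_of_three']])
```

Note that this yields a single True/False per row, not the 'TRUE, TRUE' string the question asks for.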

Answer 3 (score: 1)

Use a lambda function, for example

lambda x: ', '.join([f(x) for f in [foo, bar, baz] if f(x)])

in the call to apply. Full example:

In : dfs['new_col'] = dfs['col'].apply(lambda x: ', '.join([f(x) for f in [foo, bar, baz] if f(x)]))

In : dfs
Out: 
                                                 col     new_col
0  The quick hi brown fox hi jumps over the lazy dog        TRUE
1  The quick hi brown fox bye jumps over the lazy...  TRUE, TRUE
2  The NO quick brown fox ok jumps bye over the l...  TRUE, TRUE