Pandas apply(), transform() ERROR = invalid dtype determination in get_concat_dtype

Time: 2015-09-02 16:47:15

Tags: pandas

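This follows on from this question, which I link as background, but the question here stands on its own.

4 questions:

  1. I can't understand the error I see when using apply or transform: "invalid dtype determination in get_concat_dtype"
  2. Why does ClipAndNetMean work but the other two methods don't?
  3. I'm not sure whether, or why, I need .copy(deep = True)
  4. Why does calling the InnerFoo function need slightly different syntax?

DataFrame: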

                  cost
    section item      
    11      1       25
            2      100
            3       77
            4       10
    12      5       50
            1       39
            2        7
            3       32
    13      4       19
            1       21
            2       27
    

Code:

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame(data = {'section' : [11,11,11,11,12,12,12,12,13,13,13]
                       ,'item' : [1,2,3,4,5,1,2,3,4,1,2]
                       ,'cost' : [25.,100.,77.,10.,50.,39.,7.,32.,19.,21.,27.]
                  })
    df.set_index(['section','item'],inplace=True)
    
    upper = 50
    lower = 10
    
    def ClipAndNetMean(cost,upper,lower):
        avg = cost.mean()
        new_cost = (cost- avg).clip(lower,upper)
        return new_cost
    
    def MiniMean(cost,upper,lower):
        cost_clone = cost.copy(deep=True)
        cost_clone['A'] = lower
        cost_clone['B'] = upper
        v  = cost_clone.apply(np.mean,axis=1)
        return v.to_frame()
    
    def InnerFoo(lower,upper):
        def inner(group):
            group_clone = group.copy(deep=True)
            group_clone['lwr'] = lower
            group_clone['upr'] = upper
            v  = group_clone.apply(np.mean,axis=1)
            return v.to_frame()
        return inner
    
    # These 2 work fine. The extra arguments are passed in the order of the
    # function signature (cost, upper, lower).
    print(df.groupby(level='section').apply(ClipAndNetMean, upper, lower))
    print(df.groupby(level='section').transform(ClipAndNetMean, upper, lower))
    
    # apply works but not transform
    print(df.groupby(level='section').apply(MiniMean, upper, lower))
    print(df.groupby(level='section').transform(MiniMean, upper, lower))
    
    # apply works but not transform
    print(df.groupby(level='section').apply(InnerFoo(lower, upper)))
    print(df.groupby(level='section').transform(InnerFoo(lower, upper)))
    
    exit()
    

So, in response to Chris's answer: note that if I add the column header back, the methods work in the transform call.

See v.columns = ['cost']

    def MiniMean(cost,upper,lower):
        cost_clone = cost.copy(deep=True)
        cost_clone['A'] = lower
        cost_clone['B'] = upper
        v  = cost_clone.apply(np.mean,axis=1)
        v = v.to_frame()
        v.columns = ['cost']
        return v
    
    def InnerFoo(lower,upper):
        def inner(group):
            group_clone = group.copy(deep=True)
            group_clone['lwr'] = lower
            group_clone['upr'] = upper
            v  = group_clone.apply(np.mean,axis=1)
            v = v.to_frame()
            v.columns = ['cost']
            return v
        return inner 
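
The renaming step could plausibly be folded into to_frame, which accepts a name for the resulting column. A minimal sketch, assuming a hypothetical two-row group and a hypothetical helper named mini_mean (neither is part of the code above):

```python
import pandas as pd

# Hypothetical group, shaped like one section of the frame above.
group = pd.DataFrame({'cost': [25., 100.]})

def mini_mean(cost, lower, upper):
    clone = cost.copy()
    clone['A'] = lower
    clone['B'] = upper
    # Row-wise mean of cost, lower and upper, named 'cost' in one step
    # so the result carries the same column label as the input.
    return clone.mean(axis=1).to_frame('cost')

result = mini_mean(group, 10, 50)
print(result)
```

The result keeps the index of the group and a single column called cost, which is the shape the transform call accepted once the header was restored.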
    

1 answer:

Answer 0 (score: 1)

1 & 2) transform expects something "like-indexed", whereas apply is flexible. The two failing functions are adding additional columns.
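
A minimal sketch of that distinction, assuming a recent pandas and a smaller, hypothetical frame rather than the question's data: a function that returns one value per input row fits transform, while apply also accepts a function that collapses each group to a single value.

```python
import pandas as pd

# Hypothetical, reduced data (not the frame from the question).
df = pd.DataFrame({
    'section': [11, 11, 12, 12],
    'cost': [25., 100., 50., 39.],
})
grouped = df.groupby('section')['cost']

# Like-indexed: one output value per input row, so transform can
# align the result with the original index.
demeaned = grouped.transform(lambda s: s - s.mean())

# apply is flexible: here each group collapses to one number,
# and the result is indexed by the group key instead.
spread = grouped.apply(lambda s: s.max() - s.min())

print(demeaned)
print(spread)
```

The transform result lines up with df row by row, whereas the apply result has one entry per section.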

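3) In some cases (e.g. if you're passing a whole DataFrame into a function) it can be necessary to copy to avoid mutating the original object. It isn't necessary here.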
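
To illustrate when the copy matters, here is a small sketch with two hypothetical helpers (their names and the two-row frame are assumptions for the example): one adds columns to the frame it receives, the other works on a copy.

```python
import pandas as pd

def mean_with_bounds_in_place(frame, lower, upper):
    # Adds columns directly to the frame it was given.
    frame['lwr'] = lower
    frame['upr'] = upper
    return frame.mean(axis=1)

def mean_with_bounds_on_copy(frame, lower, upper):
    # Works on a private copy, leaving the caller's frame untouched.
    clone = frame.copy()
    clone['lwr'] = lower
    clone['upr'] = upper
    return clone.mean(axis=1)

original = pd.DataFrame({'cost': [25., 100.]})

mean_with_bounds_on_copy(original, 10, 50)
columns_after_copy = list(original.columns)

mean_with_bounds_in_place(original, 10, 50)
columns_after_in_place = list(original.columns)

print(columns_after_copy)
print(columns_after_in_place)
```

Only the in-place helper leaves the extra lwr and upr columns behind on the caller's frame.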

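4) The first two functions take a DataFrame plus two parameters and return data. InnerFoo actually returns another function, so it needs to be called before being passed into apply.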
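
The two calling styles can be sketched side by side. This is an illustration under assumptions: clip_net_mean and make_clipper are hypothetical stand-ins for the question's functions, and the data is a reduced, made-up frame.

```python
import pandas as pd

def clip_net_mean(cost, lower, upper):
    # Plain function: groupby forwards the extra positional arguments.
    return (cost - cost.mean()).clip(lower, upper)

def make_clipper(lower, upper):
    # Factory: captures lower/upper and returns the function that
    # groupby will actually call with each group.
    def inner(cost):
        return (cost - cost.mean()).clip(lower, upper)
    return inner

df = pd.DataFrame({
    'section': [11, 11, 12, 12],
    'cost': [25., 100., 50., 39.],
})
grouped = df.groupby('section')['cost']

# Pass the function itself and let apply supply the arguments...
via_args = grouped.apply(clip_net_mean, -10, 10)
# ...or call the factory first and pass the function it returns.
via_closure = grouped.apply(make_clipper(-10, 10))

print(via_args.equals(via_closure))
```

Both routes compute the same values; the closure form simply binds the bounds before groupby ever sees the function.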