Sum specific rows based on a boolean indicator and return the result in new columns

Time: 2017-06-19 19:07:12

Tags: python pandas dataframe data-manipulation

I have a DataFrame that looks like this:

DF = 
 ID  Shop  Sales  Ind 
 1   A     554    T
 2   B     678    F
 3   A     546    T
 4   A     896    T
 5   B     426    F
 6   B     391    T
 7   C     998    F
 8   C     565    T
 9   C     128    T

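For each ID, I am trying to compute the total sales of its shop so that the totals end up in separate columns, as shown below (where x is the sum). Only the Sales values that match a True value in the Ind variable should be summed.

DF2 = 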
 ID  Shop  Sales  Ind   A_Sum    B_Sum   C_Sum
 1     A     554    T     x       0       0
 2     B     678    F     0       x       0
 3     A     546    T     x       0       0
 4     A     896    T     x       0       0
 5     B     426    F     0       x       0
 6     B     391    T     0       x       0
 7     C     998    F     0       0       x
 8     C     565    T     0       0       x
 9     C     128    T     0       0       x

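I tried this, but I'm far from correct! I'm stuck on how to encode the boolean indexing in the sum operation, and on how to name the columns automatically.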

DF2 = DF.groupby(['ID', 'Shop'])['Sales'].transform('sum')   

Any help with this?

3 answers:

Answer 0 (score: 1)

Based on your attempt:
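A minimal sketch of one way the groupby/transform attempt from the question could be extended, not necessarily what this answer originally showed. It assumes Ind is stored as real booleans (if it holds the strings 'T'/'F', `df["Ind"].eq("T")` could serve as the mask); the name `flagged_total` is made up for illustration. Sales with a False indicator are zeroed out first, the remainder is summed per shop, and each total is then written into the column of its own shop.

```python
import pandas as pd

# Data from the question; Ind is assumed to be boolean
df = pd.DataFrame({
    "ID": range(1, 10),
    "Shop": list("ABAABBCCC"),
    "Sales": [554, 678, 546, 896, 426, 391, 998, 565, 128],
    "Ind": [True, False, True, True, False, True, False, True, True],
})

# Zero out sales whose indicator is False, then total per shop
# and broadcast that total back onto every row of the shop
flagged_total = df["Sales"].where(df["Ind"], 0).groupby(df["Shop"]).transform("sum")

# One column per shop, named automatically:
# the shop's total in its own rows, 0 everywhere else
for shop in sorted(df["Shop"].unique()):
    df[f"{shop}_Sum"] = flagged_total.where(df["Shop"] == shop, 0)

print(df)
```

With the question's data, the totals that fill in for x would be 1996 for A, 391 for B and 693 for C.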

Answer 1 (score: 0)

Perhaps you want something like this?

# Shop and sales values from the question, in row order
Shop = ["A", "B", "A", "A", "B", "B", "C", "C", "C"]
Sales = [554, 678, 546, 896, 426, 391, 998, 565, 128]
A = []
B = []
C = []
# Collect each sale into the list for its shop
for shop, sale in zip(Shop, Sales):
    if shop == "A":
        A.append(sale)
    elif shop == "B":
        B.append(sale)
    else:
        C.append(sale)
# Total sales per shop (the Ind column is not taken into account here)
print(sum(A), sum(B), sum(C))
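
The loop above totals every sale regardless of Ind. As a rough sketch of how the indicator from the question could be honored in plain Python, the variant below accumulates into a dictionary keyed by shop. The `ind` list is an assumption: it encodes the question's T/F column as booleans, and the lowercase names are chosen only for this example.

```python
from collections import defaultdict

shops = ["A", "B", "A", "A", "B", "B", "C", "C", "C"]
sales = [554, 678, 546, 896, 426, 391, 998, 565, 128]
# Assumption: the question's T/F column, encoded as booleans
ind = [True, False, True, True, False, True, False, True, True]

totals = defaultdict(int)
for shop, amount, flag in zip(shops, sales, ind):
    if flag:  # only count sales whose indicator is True
        totals[shop] += amount

print(dict(totals))  # {'A': 1996, 'B': 391, 'C': 693}
```

Using a dictionary instead of one list per shop means the code does not need to know the shop names in advance.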

Answer 2 (score: 0)

You can do it like this:

df.merge(df.groupby(['ID','Shop']).Sales.sum().unstack(fill_value = 0).reset_index(), on = 'ID').rename(columns = {'A': 'A_sum', 'B': 'B_sum', 'C': 'C_sum'})


    ID  Shop    Sales   Ind A_sum   B_sum   C_sum
0   1   A       554     T   554     0       0
1   2   B       678     F   0       678     0
2   3   A       546     T   546     0       0
3   4   A       896     T   896     0       0
4   5   B       426     F   0       426     0
5   6   B       391     T   0       391     0
6   7   C       998     F   0       0       998
7   8   C       565     T   0       0       565
8   9   C       128     T   0       0       128

Another solution, which avoids merge or join and is faster, produces the same result:

df[['ID','A_sum', 'B_sum', 'C_sum']] = df.groupby(['ID','Shop']).Sales.sum().unstack(fill_value = 0).reset_index()
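
Both snippets in this answer place each row's own Sales value in its shop column and do not filter on Ind. If per-shop totals restricted to Ind == True are what is wanted, a hedged sketch along similar wide-column lines might look like the following. It assumes Ind is boolean; `shop_totals`, `dummies`, `sums` and `result` are illustrative names, not part of the answer.

```python
import pandas as pd

# Data from the question; Ind is assumed to be boolean
df = pd.DataFrame({
    "ID": range(1, 10),
    "Shop": list("ABAABBCCC"),
    "Sales": [554, 678, 546, 896, 426, 391, 998, 565, 128],
    "Ind": [True, False, True, True, False, True, False, True, True],
})

# Per-shop totals over the rows where Ind is True
shop_totals = df.loc[df["Ind"]].groupby("Shop")["Sales"].sum()

# 0/1 indicator columns, one per shop
dummies = pd.get_dummies(df["Shop"]).astype(int)

# Scale each row's indicator by the total of that row's shop;
# shops with no True rows fall back to 0
per_row_total = df["Shop"].map(shop_totals).fillna(0)
sums = dummies.mul(per_row_total, axis=0).add_suffix("_sum")

result = df.join(sums)
print(result)
```

The column names come from the shop labels plus a suffix, so new shops would get their own columns without any manual renaming.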