Question

I would be very grateful if someone could tell me how to accomplish the following task. Suppose I have a data frame in Python like this:

  col1 col2 col3 col4
0    A 2001    2    5
1    A 2001    2    4
2    A 2001    3    6
3    A 2002    4    5
4    B 2001    2    9
5    B 2001    2    4
6    B 2001    2    3
7    B 2001    3   95

If several rows share the same values in col1, col2 and col3, I want to take the mean of their col4 values and then drop the rows that duplicate those first three columns. For example, the first two rows have identical col1, col2 and col3 values, so one of them should be removed and col4 updated to the mean of 5 and 4, which is 4.5. The result should contain one row per unique combination of col1, col2 and col3.

Answer 1

Use groupby to group on 'col1', 'col2' and 'col3', then take the mean of the 'col4' column:

print(df.groupby(['col1','col2','col3'],as_index=False)['col4'].mean())

Output:

  col1  col2  col3       col4
0    A  2001     2   4.500000
1    A  2001     3   6.000000
2    A  2002     4   5.000000
3    B  2001     2   5.333333
4    B  2001     3  95.000000
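
For anyone who wants to reproduce this end to end, here is a minimal sketch. The data frame is rebuilt by hand from the table in the question, and the column dtypes (strings for col1, integers for the rest) are an assumption, since the question does not state them.

```python
import pandas as pd

# Sample data copied from the question's table (dtypes are assumed).
df = pd.DataFrame({
    "col1": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "col2": [2001, 2001, 2001, 2002, 2001, 2001, 2001, 2001],
    "col3": [2, 2, 3, 4, 2, 2, 2, 3],
    "col4": [5, 4, 6, 5, 9, 4, 3, 95],
})

# Collapse rows that share (col1, col2, col3) into one row each;
# col4 becomes the mean of the group. as_index=False keeps the
# grouping keys as ordinary columns instead of a MultiIndex.
result = df.groupby(["col1", "col2", "col3"], as_index=False)["col4"].mean()
print(result)
```

By default groupby sorts the result by the grouping keys. If the rows should stay in order of first appearance instead, passing sort=False to groupby should do that.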
