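I have a situation where I want to group a dataset by a custom-defined week index, average each group, and then sum those averages into a "Total" row. I can do the first half, but when I try to append/insert a new "Total" row that sums the rows, I get an error.
I have tried creating this row in two different ways:
Method 1: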
week_index_avg_unit.loc['Total'] = week_index_avg_unit.sum()
TypeError: cannot append a non-category item to a CategoricalIndex
Method 2:
week_index_avg_unit.index.insert(['Total'], week_index_avg_unit.sum())
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
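I have used the first method many times in this kind of situation, but this is the first time I have grouped the data into categories, and I am fairly sure the CategoricalIndex type is where the problem lies.
Here is the format of my data: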
date organic ppc oa other content_partnership total \
0 2018-01-01 379 251 197 51 0 878
1 2018-01-02 880 527 405 217 0 2029
2 2018-01-03 859 589 403 323 0 2174
3 2018-01-04 835 533 409 335 0 2112
4 2018-01-05 760 449 355 272 0 1836
year_month day weekday weekday_name week_index
0 2018-01 1 0 Monday Week 1
1 2018-01 2 1 Tuesday Week 1
2 2018-01 3 2 Wednesday Week 1
3 2018-01 4 3 Thursday Week 1
4 2018-01 5 4 Friday Week 1
Here is the code:
import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
historicals = pd.read_csv("2018-2019_plants.csv")
# Capture dates for additional date columns
date_col = pd.to_datetime(historicals['date'])
historicals['year_month'] = date_col.dt.strftime("%Y-%m")
historicals['day'] = date_col.dt.day
historicals['weekday'] = date_col.dt.dayofweek
historicals['weekday_name'] = date_col.dt.day_name()
# create week ranges segment (7 day range)
historicals['week_index'] = pd.cut(historicals['day'],[0,7,14,21,28,32], labels=['Week 1','Week 2','Week 3','Week 4','Week 5'])
# Week Index Average (Units)
week_index_avg_unit = historicals[df_monthly_average].groupby(['week_index']).mean().astype(int)
type(week_index_avg_unit.index)
pandas.core.indexes.category.CategoricalIndex
Here is the week_index_avg_unit table:
organic ppc oa other content_partnership total day weekday
week_index
Week 1 755 361 505 405 22 2027 4 3
Week 2 787 360 473 337 19 1959 11 3
Week 3 781 382 490 352 18 2006 18 3
...
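Answer 0 (score: 0)
pd.CategoricalIndex is a special animal. It is immutable, so to pull off this trick you will probably need to use something like pd.CategoricalIndex.set_categories to add the new category.
See the pandas documentation: https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.CategoricalIndex.html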
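A minimal sketch of that idea, not tested against the asker's data. It uses add_categories (a sibling of set_categories that only appends a category) to register "Total" before assigning the row, and also shows the alternative of casting the index to plain strings. The frame names (days, weekly, plain) and the sample numbers are made up for illustration, and observed=True is an assumption added so that empty week bins are dropped.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: a few days with two metrics.
days = pd.DataFrame({
    "day": [1, 2, 8, 9, 15],
    "organic": [379, 880, 859, 835, 760],
    "ppc": [251, 527, 589, 533, 449],
})

# pd.cut returns a categorical column, so grouping on it yields a CategoricalIndex.
days["week_index"] = pd.cut(
    days["day"], [0, 7, 14, 21, 28, 32],
    labels=["Week 1", "Week 2", "Week 3", "Week 4", "Week 5"],
)
base = (
    days.groupby("week_index", observed=True)[["organic", "ppc"]]
    .mean()
    .astype(int)
)

# Option A: register "Total" as a valid category, then add the row.
weekly = base.copy()
totals = weekly.sum()  # computed while only the weekly rows exist
weekly.index = weekly.index.add_categories("Total")
weekly.loc["Total"] = totals

# Option B: drop the categorical dtype and use plain string labels.
plain = base.copy()
plain.index = plain.index.astype(str)
plain.loc["Total"] = plain.sum()

print(weekly)
print(plain)
```

Option B is the simpler route if the category ordering is not needed afterwards; Option A keeps "Total" as one of the index's categories.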