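I have created a dict of dicts with the following structure: the outer key is the department ('ABC'), the next key is the date (01.08), and the value is {product name (A), units (0), revenue (0)}. The same structure repeats for several departments. See the printout of the dict below.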
'ABC': 01.08 \
A. Units 0
Revenue 0
B. Units 0
Revenue 0
C. Units 0
Revenue 0
D. Units 0
Revenue 0
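In addition, I created a dataframe using groupby and an aggregate function (sum) to get the total units and revenue per day for each department. This is an aggregation over two levels, as opposed to the three levels in the dict (date, department, product).
Printing the df, which holds the aggregated units and total revenue, gives: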
print(df.loc['ABC'])
Total Overall Units \
dates
2016-08-01 2
2016-08-02 0
2016-08-03 2
2016-08-04 1
2016-08-22 2
Total Overall Revenue \
dates
2016-08-01 20
2016-08-02 500
2016-08-03 39
2016-08-04 50
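I currently end up with two separate objects that I want to merge/append, so that the total units and total revenue are added to the end of the dict in the right place (i.e. mapped to the correct department and date).
At the moment I print the dict and then separately print the dataframe with to_html, department by department, so I am left with two separate tables. Not only are they separate, but the table created from the df also has one fewer column, because the two are grouped differently. What I want is a single table like this: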
'ABC':
01.08 | 02.08 | 03.08 | 04.08
A Total Units 0 0 0 0
Total Revenue 0 0 0 0
B Total Units 0 0 0 0
Total Revenue 0 0 0 0
C Total Units 0 0 0 0
Total Revenue 0 0 0 0
D Total Units 0 0 0 0
Total Revenue 0 0 0 0
Total Overall Units 0 0 0 0
Total Overall Revenue 0 0 0 0
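Any ideas?
Answer 0 (score: 0)
Skipping ahead to question 2: I suggest using a single dataframe to store all of your information. It is much easier to work with than keeping the data in a dict of dicts. Set the date as the primary index and either use a separate column for each field ('deptA-revenue') or use a MultiIndex. You can then store the daily totals as columns in the same dataframe.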
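A minimal sketch of that suggestion, assuming the nested dict has the layout {dept: {date: {product: {"units": ..., "revenue": ...}}}}. The sample values and the names `nested`, `long_df`, `combined`, `wide`, and `report` are made up for illustration. The per-department totals are added as a pseudo-product called "Total Overall" so that they live in the same dataframe as the product figures:

```python
import pandas as pd

# Hypothetical sample data in the assumed layout.
nested = {
    "ABC": {
        "01.08": {"A": {"units": 1, "revenue": 10},
                  "B": {"units": 1, "revenue": 10}},
        "02.08": {"A": {"units": 0, "revenue": 0},
                  "B": {"units": 0, "revenue": 500}},
    },
}

# Flatten the dict into one row per (date, dept, product).
rows = [
    {"date": date, "dept": dept, "product": product,
     "units": stats["units"], "revenue": stats["revenue"]}
    for dept, dates in nested.items()
    for date, products in dates.items()
    for product, stats in products.items()
]
long_df = pd.DataFrame(rows)

# Daily totals per department, stored as an extra "product".
totals = long_df.groupby(["date", "dept"], as_index=False)[["units", "revenue"]].sum()
totals["product"] = "Total Overall"
combined = pd.concat([long_df, totals], ignore_index=True)

# Dates as the index; (field, dept, product) as MultiIndex columns.
wide = combined.set_index(["date", "dept", "product"]).unstack(["dept", "product"])

# One department, transposed so that dates become the columns.
report = wide.xs("ABC", axis=1, level="dept").T.swaplevel().sort_index()
print(report)
```

Calling `report.to_html()` on the result would then give a single table per department instead of two.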
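Answer 1 (score: 0)
To print in the order you want, you need to transpose the rows and columns: the dates are keys in the dictionary but columns in the printout. It is probably easiest to compute the totals while you do this, which makes the second object you mention unnecessary. Formatting aside, something like this should work: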
    # `data` is the dict of dicts from the question. Assumed layout:
    # {dept: {date: {product_key: {"name": ..., "units": ..., "revenue": ...}}}}
    for dept, dates in data.items():
        # Transpose the rows and columns into two new dictionaries
        # called units and revenues. At the same time, total the
        # units and revenue into two new "zztotal" entries.
        units = {"zztotal": {}}
        revenues = {"zztotal": {}}
        for date, products in dates.items():
            for product, stats in products.items():
                name = stats["name"]
                if name not in units:
                    units[name] = {}
                    revenues[name] = {}
                units[name][date] = stats["units"]
                revenues[name][date] = stats["revenue"]
                if date not in units["zztotal"]:
                    units["zztotal"][date] = 0
                    revenues["zztotal"][date] = 0
                units["zztotal"][date] += stats["units"]
                revenues["zztotal"][date] += stats["revenue"]
        # At this point we are ready to print the transposed
        # dictionaries. Work is needed to line up the columns
        # so the printout is attractive.
        print(dept)
        print(sorted(dates.keys()))
        # "zztotal" sorts after the product names, so the totals print last.
        for name, by_date in sorted(units.items()):
            if name != "zztotal":
                print(name, "Total Units", [
                    by_date[date] for date in sorted(by_date)])
                print("Total Revenue", [
                    revenues[name][date] for date in sorted(by_date)])
            else:
                print("Total Overall Units", [
                    by_date[date] for date in sorted(by_date)])
                print("Total Overall Revenue", [
                    revenues[name][date] for date in sorted(by_date)])
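
The column alignment that the comments above leave as further work could be handled with plain string padding. A minimal sketch, assuming the same layout in which each product entry carries "name", "units", and "revenue"; the helper name `format_department`, the column widths, and the sample data are hypothetical:

```python
def format_department(dept, dates, width=8):
    """Return a text table: one column per date, two rows per product,
    followed by the overall totals."""
    all_dates = sorted(dates)
    units, revenues = {}, {}
    total_units = dict.fromkeys(all_dates, 0)
    total_revenue = dict.fromkeys(all_dates, 0)
    for date, products in dates.items():
        for stats in products.values():
            name = stats["name"]
            units.setdefault(name, {})[date] = stats["units"]
            revenues.setdefault(name, {})[date] = stats["revenue"]
            total_units[date] += stats["units"]
            total_revenue[date] += stats["revenue"]

    def row(label, values):
        # A product with no entry for a date is shown as 0 (an assumption).
        cells = "".join(str(values.get(d, 0)).rjust(width) for d in all_dates)
        return label.ljust(24) + cells

    lines = [repr(dept) + ":", row("", {d: d for d in all_dates})]
    for name in sorted(units):
        lines.append(row(name + " Total Units", units[name]))
        lines.append(row("  Total Revenue", revenues[name]))
    lines.append(row("Total Overall Units", total_units))
    lines.append(row("Total Overall Revenue", total_revenue))
    return "\n".join(lines)


# Hypothetical sample data for one department.
sample = {
    "01.08": {"p1": {"name": "A", "units": 1, "revenue": 10},
              "p2": {"name": "B", "units": 1, "revenue": 10}},
    "02.08": {"p1": {"name": "A", "units": 0, "revenue": 0},
              "p2": {"name": "B", "units": 0, "revenue": 500}},
}
table = format_department("ABC", sample)
print(table)
```

Keeping the totals in their own dictionaries, rather than under a "zztotal" key, avoids relying on sort order to place them last.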