Taking the mean by one column and then by another in pandas

Time: 2016-04-14 04:12:13

Tags: python pandas

I have the following dataset:


import pandas as pd

data = {
    'VALVE_SCORE': {0: 34.1, 1: 41.0, 2: 49.7, 3: 53.8, 4: 35.8, 5: 49.2, 6: 38.6, 7: 51.2, 8: 44.8, 9: 51.5,
                    10: 41.9, 11: 46.0, 12: 41.9, 13: 51.4, 14: 35.0, 15: 49.7, 16: 41.5, 17: 51.5, 18: 45.2, 19: 53.4,
                    20: 38.1, 21: 50.2, 22: 25.4, 23: 30.0, 24: 28.1, 25: 49.9, 26: 27.5, 27: 37.2, 28: 27.7, 29: 45.7,
                    30: 27.2, 31: 30.0, 32: 27.9, 33: 34.3, 34: 29.5, 35: 34.5, 36: 28.0, 37: 33.6, 38: 26.8, 39: 31.8},
    'DAY': {0: 6, 1: 6, 2: 6, 3: 6, 4: 13, 5: 13, 6: 13, 7: 13, 8: 20, 9: 20,
            10: 20, 11: 20, 12: 27, 13: 27, 14: 27, 15: 27, 16: 3, 17: 3, 18: 3, 19: 3,
            20: 10, 21: 10, 22: 10, 23: 10, 24: 17, 25: 17, 26: 17, 27: 17, 28: 24, 29: 24,
            30: 24, 31: 24, 32: 3, 33: 3, 34: 3, 35: 3, 36: 10, 37: 10, 38: 10, 39: 10},
    'MONTH': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1,
              10: 1, 11: 1, 12: 1, 13: 1, 14: 1, 15: 1, 16: 2, 17: 2, 18: 2, 19: 2,
              20: 2, 21: 2, 22: 2, 23: 2, 24: 2, 25: 2, 26: 2, 27: 2, 28: 2, 29: 2,
              30: 2, 31: 2, 32: 3, 33: 3, 34: 3, 35: 3, 36: 3, 37: 3, 38: 3, 39: 3},
}
df = pd.DataFrame(data)

First, I want to take the mean by day, and then by month. However, taking the mean by grouping on the day produces fractional months:

In [401]: df.groupby("DAY").mean()
Out[401]: 
     VALVE_SCORE  MONTH
DAY
3        39.7250    2.5
6        44.6500    1.0
10       32.9875    2.5
13       43.7000    1.0
17       35.6750    2.0
20       46.0500    1.0
24       32.6500    2.0
27       44.5000    1.0

I would like to retain the months before I do a

groupby('MONTH').mean()

so that the final result is the monthly mean of the daily means.

2 Answers:

Answer 0 (score: 0)

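Here is one possible solution. Please let me know if there is a more efficient way.

import pandas as pd

df = pd.DataFrame(data)  # `data` is the dict from the question

months = list(df['MONTH'].unique())

frames = []
for p in months:
    # Keep only the rows for this month
    df_part = df[df['MONTH'] == p]
    # Mean per day within the month; MONTH is constant here, so it survives the mean
    df_part_avg = df_part.groupby("DAY", as_index=False).mean()
    df_part_avg = df_part_avg.drop('DAY', axis=1)
    frames.append(df_part_avg)

# Stack the daily means, then average them per month
df_months = pd.concat(frames)
df_final = df_months.groupby("MONTH", as_index=False).mean()

The result, df_final, has one row per month holding the mean of that month's daily means (44.7250, 38.0375 and 30.8000 for months 1, 2 and 3).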
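On the question of a more efficient way: a minimal sketch that avoids the explicit loop by grouping on both columns at once and then averaging over the month level. The `scores`, `days` and `months` lists below are just a compact way of rebuilding the question's data, and the names `daily` and `monthly` are made up for this example.

```python
import pandas as pd

# Compact rebuild of the question's data (same values, same row order)
scores = [34.1, 41.0, 49.7, 53.8, 35.8, 49.2, 38.6, 51.2, 44.8, 51.5,
          41.9, 46.0, 41.9, 51.4, 35.0, 49.7, 41.5, 51.5, 45.2, 53.4,
          38.1, 50.2, 25.4, 30.0, 28.1, 49.9, 27.5, 37.2, 27.7, 45.7,
          27.2, 30.0, 27.9, 34.3, 29.5, 34.5, 28.0, 33.6, 26.8, 31.8]
# Each day value covers four consecutive rows
days = [d for d in (6, 13, 20, 27, 3, 10, 17, 24, 3, 10) for _ in range(4)]
months = [1] * 16 + [2] * 16 + [3] * 8
df = pd.DataFrame({"VALVE_SCORE": scores, "DAY": days, "MONTH": months})

# Step 1: mean per (MONTH, DAY) pair, so months are never averaged together
daily = df.groupby(["MONTH", "DAY"])["VALVE_SCORE"].mean()

# Step 2: mean of those daily means within each month
monthly = daily.groupby(level="MONTH").mean()

print(daily)
print(monthly)
```

Because MONTH is part of the group key in the first step, it stays an integer label instead of being averaged into 2.5. The monthly values should agree with the loop-based version above.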

Answer 1 (score: 0)

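Considering the data you have, you want the daily means and then the monthly means. Putting the data in an Excel pivot table produces the following result:

[Image: Excel pivot table of the data]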

Doing the same thing in pandas, grouping by month alone is enough to get the same result:

df.groupby(['MONTH']).mean()
        DAY  VALVE_SCORE
MONTH
1      16.5      44.7250
2      13.5      38.0375
3       6.5      30.8000

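Because the month and day values are numeric, pandas averages them as well. If the 'DAY' and 'MONTH' values were strings rather than numbers, you would get the following result: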

       VALVE_SCORE
MONTH
1          44.7250
2          38.0375
3          30.8000

So the result is the same as if pandas had calculated the daily means first and then used them to calculate the monthly means.
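One caveat worth spelling out: the direct monthly mean matches the mean of daily means here only because every day in the question's data has the same number of rows (four). A small sketch with made-up numbers (the `toy` frame below is hypothetical, not from the question) shows the two diverging once the days have different row counts:

```python
import pandas as pd

# Hypothetical data: day 6 has two rows, day 13 has only one
toy = pd.DataFrame({
    "MONTH": [1, 1, 1],
    "DAY": [6, 6, 13],
    "VALVE_SCORE": [10.0, 20.0, 60.0],
})

# Direct monthly mean: every row counts equally, (10 + 20 + 60) / 3
direct = toy.groupby("MONTH")["VALVE_SCORE"].mean()

# Mean of daily means: every day counts equally, (15 + 60) / 2
two_step = (
    toy.groupby(["MONTH", "DAY"])["VALVE_SCORE"].mean()
       .groupby(level="MONTH").mean()
)

print(direct.loc[1], two_step.loc[1])
```

So if the number of readings per day can vary, it is probably safer to do the two-step grouping explicitly rather than rely on a single groupby on MONTH.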