Add row values first, then drop duplicate rows based on a subset of labels

Time: 2017-08-16 04:47:17

Tags: pandas duplicates pivot-table add

I have this code, which should run as-is for you.

import pandas as pd
import urllib.request
import numpy as np

url = "https://www.misoenergy.org/Library/Repository/Market%20Reports/20170811_da_bc.xls"
cnstxls = urllib.request.urlopen(url)
xl = pd.ExcelFile(cnstxls)
df = xl.parse("Sheet1", skiprows=3)
# Take an explicit copy so that adding a column does not raise SettingWithCopyWarning
constr = df.iloc[:, 1:7].copy()
# Hours 1-6 are classed as Offpeak, all other hours as Onpeak
constr['Class'] = np.where(constr['Hour of Occurrence'].isin([1, 2, 3, 4, 5, 6]), 'Offpeak', 'Onpeak')
summary = pd.pivot_table(constr, index=['Constraint_ID', 'Constraint Name'], values='Shadow Price', columns=['Class'], aggfunc='sum')

Here is part of the output produced by the code above:

Class                                                Offpeak  Onpeak
Constraint_ID Constraint Name                                       
1049          EAU CLA TR9 FLO EAU CLR XF10            -46.52 -364.68
1607          OTTUMWA-WAPELLO FLO HILLS-MONTEZUM       -2.60 -237.36
285770        DKSN-MATTHSON FLO BELFLD-CHRLIE CK         NaN  -59.53
              MATTHSON MATTHDKSN_11_1 1 WAU34028       43.66     NaN
              MATTHSON_MATTHDKSN_11_1_1_LN              6.55     NaN
287090        BAKER2_TR11_TR11_XF                      11.78    1.63
289484        BAKER2 TR12 TR12 WAUMDU13                  NaN   -4.52
              BAKER2_TR12_TR12_XF                     -10.41     NaN

I would like to achieve the following:

1) Sum the Offpeak and Onpeak values across rows that share the same Constraint_ID. For example, Constraint_ID=285770 has three different Constraint Names, each with its own values.
2) Drop the duplicate Constraint_IDs, keeping the first Constraint Name.
3) Create a third column that adds Offpeak and Onpeak.

Any help is much appreciated.

1 answer:

Answer 0 (score: 1)

If I understand correctly:

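# Collapse to one row per Constraint_ID: keep the first name, sum both price columns
df_out = summary.reset_index().groupby('Constraint_ID')\
                 .agg({'Constraint Name':'first','Offpeak':'sum','Onpeak':'sum'})

# fill_value=0 treats a missing Offpeak or Onpeak value as 0 when adding
df_out['Total'] = df_out['Offpeak'].add(df_out['Onpeak'], fill_value=0)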

Output:

                                     Constraint Name  Offpeak  Onpeak   Total
Constraint_ID                                                                
1049                    EAU CLA TR9 FLO EAU CLR XF10   -46.52 -364.68 -411.20
1607              OTTUMWA-WAPELLO FLO HILLS-MONTEZUM    -2.60 -237.36 -239.96
4636              KNOXVIL-LUCAS FLO BEACON-TRICOUNTY      NaN -276.62 -276.62
4862                               NEAST_K-11_2_2_LN      NaN   28.03   28.03
9712                             VVWNSP TR1 TR1 BASE      NaN    0.84    0.84
98333          HORN-TRENTONC FLO NAVARRE 230/120 201      NaN -112.05 -112.05
107318          LOSTDAU-REDMAPL FLO HIGHWAY22-MORGAN      NaN  -96.10  -96.10
188144                    EQIN EQIN-HAM-4 A ALW16X45    26.32     NaN   26.32
201082              BOGLSA AT3 FLO FRANKLIN-MCKNIGHT      NaN -245.44 -245.44
248010                     OLIVER OLIVROWEND1 1 BASE      NaN    1.48    1.48
267644           DANVL3-DODSON FLO MT OLIVE-LAYFIELD      NaN  -73.25  -73.25
268470                    HENSEL_HENSEDRAYT11_1_1_LN      NaN    1.14    1.14
270366                LMR-PONDER FLO CONROE BLK-PNDR      NaN  -80.71  -80.71
270888            DUMAS-REEDX FLO WOLF CREEK-MCADAMS      NaN  -19.68  -19.68
281292             NSES-RAM452 FLO BLACKBERRY-NEOSHO   -33.80 -780.05 -813.85
285770            DKSN-MATTHSON FLO BELFLD-CHRLIE CK    50.21  -59.53   -9.32
287090                           BAKER2_TR11_TR11_XF    11.78    1.63   13.41
289484                     BAKER2 TR12 TR12 WAUMDU13   -10.41   -4.52  -14.93
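
For anyone who wants to try this without downloading the spreadsheet, here is a minimal sketch on a hand-built stand-in for `summary`. The values are copied from the tables above; treating Constraint_ID 4636 as a single row is an assumption, and `sum_keep_nan` is a hypothetical helper, not part of the original answer. One thing to be aware of: in newer pandas versions a plain `'sum'` over a group whose values are all NaN returns 0 rather than NaN, so the sketch uses `Series.sum(min_count=1)` to keep NaN in that case, as in the output shown above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pivot table `summary` from the question.
# Values are taken from the tables in this thread; the one-row layout for
# Constraint_ID 4636 is an assumption made for illustration.
summary = pd.DataFrame(
    {
        "Offpeak": [-46.52, np.nan, np.nan, 43.66, 6.55, 11.78],
        "Onpeak": [-364.68, -276.62, -59.53, np.nan, np.nan, 1.63],
    },
    index=pd.MultiIndex.from_tuples(
        [
            (1049, "EAU CLA TR9 FLO EAU CLR XF10"),
            (4636, "KNOXVIL-LUCAS FLO BEACON-TRICOUNTY"),
            (285770, "DKSN-MATTHSON FLO BELFLD-CHRLIE CK"),
            (285770, "MATTHSON MATTHDKSN_11_1 1 WAU34028"),
            (285770, "MATTHSON_MATTHDKSN_11_1_1_LN"),
            (287090, "BAKER2_TR11_TR11_XF"),
        ],
        names=["Constraint_ID", "Constraint Name"],
    ),
)


def sum_keep_nan(s):
    """Sum a group, returning NaN when every value in the group is NaN."""
    return s.sum(min_count=1)


# One row per Constraint_ID: first name kept, price columns summed.
df_out = summary.reset_index().groupby("Constraint_ID").agg(
    {"Constraint Name": "first", "Offpeak": sum_keep_nan, "Onpeak": sum_keep_nan}
)

# Total of both columns; a NaN on one side is treated as 0.
df_out["Total"] = df_out["Offpeak"].add(df_out["Onpeak"], fill_value=0)

print(df_out)
```

If you would rather see 0 than NaN for groups with no data in a column, the plain `'sum'` strings from the answer are enough on recent pandas versions.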