Compare values for the same month across different years

Time: 2018-07-26 16:18:22

Tags: python python-3.x pandas dataframe

I have the two data frames below:
    df=
     city    code     qty  month   year
     hyd     1        10    1      2016
     hyd     2        12    2      2016
     hyd     3        15    3      2016
     hyd     1        25    1      2017
     hyd     2        15    2      2017
     hyd     3        25    4      2017
     hyd     1        25    1      2018
     hyd     2        15    3      2018
     hyd     3        25    6      2018

    b = 
     city    code     qty  month   year  
     hyd     1        10    1      2016
     hyd     2        12    2      2016
     hyd     3        18    3      2016
     hyd     4        22    4      2016
     hyd     5        10    5      2016
     hyd     6        12    6      2016
     hyd     1        12    1      2017
     hyd     2        12    2      2017
     hyd     3        16    3      2017
     hyd     4        25    4      2017
     hyd     5        10    5      2017
     hyd     6        14    6      2017
     hyd     1        10    1      2018
     hyd     2        12    2      2018
     hyd     3        18    3      2018
     hyd     4        25    4      2018
     hyd     5        10    5      2018
     hyd     6        12    6      2018

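I want to compare df with b and get, on a single row, the quantities for the same month in previous years. A year should only be compared with years earlier than itself. Below is the resulting data frame.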

     resultdf=
     city    code     qty  month   year   qty_2016   qty_2017  qty_2018
     hyd     1        10    1      2016
     hyd     2        12    2      2016
     hyd     3        15    3      2016
     hyd     1        25    1      2017       10
     hyd     2        15    2      2017       12
     hyd     3        25    4      2017       22 
     hyd     1        25    1      2018       10        15
     hyd     2        15    3      2018       18        16        
     hyd     3        25    6      2018       12        14   
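
The "earlier years only" rule can be sketched in long form before any reshaping. This is a minimal illustration, not the code from the question. It assumes the columns are integers and that rows are matched on city and month (which is what the expected table suggests), not on code.

```python
import pandas as pd

# Sample rows copied from the question; integer dtypes are an assumption.
df = pd.DataFrame({
    "city": ["hyd"] * 9,
    "code": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "qty": [10, 12, 15, 25, 15, 25, 25, 15, 25],
    "month": [1, 2, 3, 1, 2, 4, 1, 3, 6],
    "year": [2016] * 3 + [2017] * 3 + [2018] * 3,
})
b = pd.DataFrame({
    "city": ["hyd"] * 18,
    "code": [1, 2, 3, 4, 5, 6] * 3,
    "qty": [10, 12, 18, 22, 10, 12,
            12, 12, 16, 25, 10, 14,
            10, 12, 18, 25, 10, 12],
    "month": [1, 2, 3, 4, 5, 6] * 3,
    "year": [2016] * 6 + [2017] * 6 + [2018] * 6,
})

# Pair every df row with all b rows for the same city and month.
# Overlapping columns from b get the "_b" suffix (year_b, qty_b).
pairs = df.merge(
    b[["city", "month", "year", "qty"]],
    on=["city", "month"],
    suffixes=("", "_b"),
)

# Keep only reference rows from strictly earlier years.
earlier = pairs[pairs["year_b"] < pairs["year"]]
print(earlier[["code", "month", "year", "year_b", "qty_b"]])
```

Each 2016 row has no earlier year and drops out, each 2017 row keeps one reference row, and each 2018 row keeps two.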

Here is the code:

import pandas as pd

attribute_name = 'city'
attribute_code = 'code'
result = []  # collects one merged frame per (city, code, month) combination

# NOTE: every filter below compares against strings, so this assumes the
# columns of df and b hold string values.
df1 = df[df['year'].isin(['2018'])]
month_list = df1.month.unique()
feature_list = df1[attribute_name].unique()
code_list = df1[attribute_code].unique()
for feature_name in feature_list:
    for code_num in code_list:
        for month_num in month_list:
            # Rows of b for this city/code/month in 2018
            dff2 = b[(b['month'].isin([str(month_num)])) & (b['year'].isin(['2018'])) & (b['code'].isin([str(code_num)])) & (b['city'].isin([str(feature_name)]))]
            dff2 = dff2.drop(['year'], axis=1)
            dff2 = dff2.rename(columns={'qty': 'qty_2018'})
            # Same selection for 2017
            dff = b[(b['month'].isin([str(month_num)])) & (b['year'].isin(['2017'])) & (b['code'].isin([str(code_num)])) & (b['city'].isin([str(feature_name)]))]
            dff = dff.drop(['year'], axis=1)
            dff = dff.rename(columns={'qty': 'qty_2017'})
            # Same selection for 2016
            dff1 = b[(b['month'].isin([str(month_num)])) & (b['year'].isin(['2016'])) & (b['code'].isin([str(code_num)])) & (b['city'].isin([str(feature_name)]))]
            dff1 = dff1.drop(['year'], axis=1)
            dff1 = dff1.rename(columns={'qty': 'qty_2016'})
            # Put the three yearly quantities side by side on one row
            df2 = dff2.merge(dff, on=['city', 'code', 'month'], how='left')
            df3 = df2.merge(dff1, on=['city', 'code', 'month'], how='left')
            result.append(df3)
frame = pd.concat(result)
frame['year'] = 2018

In the same way, I repeat this for 2017, which gives me qty_2017 and qty_2016 as frame1, and then I concatenate frame and frame1.

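The code above gives me the desired result, but it is very slow, and the loop does not cover every year. How can I make this better and faster? Any help is appreciated.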
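
One possible way to avoid the nested loops is to reshape b so that each year becomes a column, merge once, and then blank out the columns that are not earlier than each row's year. This is a rough sketch under several assumptions: `add_prior_year_qty` is a hypothetical helper name, the columns are integers, rows are matched on city and month, and b has at most one row per (city, month, year).

```python
import pandas as pd


def add_prior_year_qty(df, b):
    """Attach qty_<year> columns from b, kept only where <year> is before the row's year."""
    # One row per (city, month), one column per year found in b.
    # unstack raises if a (city, month, year) combination occurs twice.
    wide = b.set_index(["city", "month", "year"])["qty"].unstack("year")
    years = list(wide.columns)
    wide.columns = [f"qty_{y}" for y in years]

    # A single left merge keeps every df row in its original order.
    out = df.merge(wide.reset_index(), on=["city", "month"], how="left")

    for y in years:
        col = f"qty_{y}"
        # Keep the value only when the row's year is later than y;
        # everything else becomes NaN (the column is upcast to float).
        out[col] = out[col].where(out["year"] > y)
    return out
```

Because the years are read from b, no year has to be hard-coded, and the work is one reshape plus one merge instead of a filter per city, code, and month. Run on the sample frames, this reproduces the expected table except for the January 2018 row, where b gives 12 for qty_2017 while the expected table lists 15.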

0 answers:

No answers