Advanced reshaping of a DataFrame

Asked: 2017-05-10 18:26:51

Tags: python pandas

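I have the DataFrame on the right and want to reshape it into the DataFrame on the left, mapping in the price data along the way, as shown in the image below.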

(image: source DataFrame on the right, target DataFrame on the left)

I have not been able to do this. I tried many things with unstack() and pivot, but none of them gave the result I want.
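One possible starting point, sketched under assumptions: if the source data is a long Series keyed by (Job N°, JobNature, Report Items Name), unstacking the innermost level turns each report item into a column. The level names and the variables `long_df` and `wide` are illustrative; only a subset of the sample data from this post is used.

```python
import pandas as pd

# Subset of the sample data from this post, keyed by
# (Job N°, JobNature, Report Items Name). The level names are assumed.
data = {
    (1, 'N1', 'Nb Tech'): 1,
    (1, 'N1', 'Quantity'): 1000,
    (1, 'N1', 'Subtype'): 'AR',
    (1, 'N1', 'Type'): '2FO',
    (1, 'N3', 'Nb Tech'): 3,
    (1, 'N3', 'Quantity'): 500,
    (1, 'N3', 'Subtype'): 'INNER',
    (1, 'N3', 'Type'): 'DIAM1',
}
long_df = pd.Series(data, name='Item Value')
long_df.index.names = ['Job N°', 'JobNature', 'Report Items Name']

# Move the innermost index level into columns:
# one row per (Job N°, JobNature), one column per report item.
wide = long_df.unstack('Report Items Name')
print(wide)
```

After this step, 'Type' and 'Subtype' are ordinary columns, so they can be used as merge keys against a price list.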

# coding: utf-8

# In[2]:

import requests
import pandas as pd
import numpy as np
import json
from pandas import json_normalize

# In[3]:
## Load my source data: jobDataFrame
jobDataFrame=pd.read_excel('datasource.xlsx',0)

jobDataFrame=pd.DataFrame.from_dict(

{'Item Value': {(1, 'N1', 'Nb Tech'): 1,
  (1, 'N1', 'Quantity'): 1000,
  (1, 'N1', 'Subtype'): 'AR',
  (1, 'N1', 'Type'): '2FO',
  (1, 'N1', 'item1'): '2.5',
  (1, 'N2', 'Item1'): 1,
  (1, 'N2', 'Item2'): 0,
  (1, 'N2', 'Item3'): 3,
  (1, 'N2', 'Nb Tech'): 2,
  (1, 'N3', 'Item1'): 2,
  (1, 'N3', 'Nb Tech'): 3,
  (1, 'N3', 'Quantity'): 500,
  (1, 'N3', 'Subtype'): 'INNER',
  (1, 'N3', 'Type'): 'DIAM1',
  (2, 'N1', 'Nb Tech'): 2,
  (2, 'N1', 'Quantity'): 200,
  (2, 'N1', 'Subtype'): 'AR',
  (2, 'N1', 'Type'): '12FO',
  (2, 'N1', 'item1'): 3,
  (2, 'N3', 'Nb Tech'): 1,
  (2, 'N3', 'Quantity'): 800,
  (2, 'N3', 'Subtype'): 'INNER',
  (2, 'N3', 'Type'): 'DIAM2',
  (2, 'N3', 'item1'): 2}})

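# The dict keys become three unnamed index levels, so name them here
# instead of calling set_index() on columns that do not exist.
# (Level names are assumed from the columns used elsewhere in this post.)
jobDataFrame.index.names = ['Job N°', 'JobNature', 'Report Items Name']

# In[4]:
## Load my price list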

princelist=pd.read_excel('datasource.xlsx',1,index_col=[0,1,2])
princelist
princelist.to_dict()

# This part validates that the shape I have in mind will work for me.

# In[5]:
reshaped=pd.read_excel('datasource.xlsx','reshapedJobDataFrame',index_col=[0,1,2])
reshaped
reshaped.reset_index(['Report Items Name (or Grouped itemp name)','Job N°'],inplace=True)
# set_index() returns a new frame, so assign the result back.
reshaped = reshaped.set_index(['Type', 'Subtype'], append=True)

# In[150]:

def fuc(x):
    # Qte rule: for item rows, multiply the value by the number of technicians;
    # for every other row, use the value as-is.
    if x['Type'] in ('item1', 'item2'):
        return x['Nb Tech'] * x['Value']
    return x['Value']


result = pd.merge(reshaped.reset_index(), princelist.reset_index(),
                  how='left', on=['JobNature', 'Type', 'Subtype'])

# Bracket assignment sets the columns whether or not they already exist.
result['Qte'] = result[['Type', 'Nb Tech', 'Value']].apply(fuc, axis=1)
result['Total'] = result['Qte'] * result['U.Price']
result.set_index(['Job N°', 'JobNature', 'Type', 'Subtype'], inplace=True)


result
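The Qte rule above can also be written without a row-wise apply. The following is a minimal sketch, assuming the reshaped jobs already have 'Type', 'Nb Tech' and 'Value' columns; the frames `jobs` and `prices` and their values are illustrative, and the merge is simplified to two keys.

```python
import numpy as np
import pandas as pd

# Illustrative reshaped job rows (values are made up for the sketch).
jobs = pd.DataFrame({
    'JobNature': ['N1', 'N1'],
    'Type': ['2FO', 'item1'],
    'Nb Tech': [1, 2],
    'Value': [1000.0, 2.5],
})

# Illustrative price list, keyed on JobNature and Type only for brevity.
prices = pd.DataFrame({
    'JobNature': ['N1', 'N1'],
    'Type': ['2FO', 'item1'],
    'U.Price': [1, 6],
})

result = jobs.merge(prices, how='left', on=['JobNature', 'Type'])

# Vectorized version of the rule: item rows are multiplied by 'Nb Tech',
# all other rows keep 'Value' unchanged.
is_item = result['Type'].isin(['item1', 'item2'])
result['Qte'] = np.where(is_item, result['Nb Tech'] * result['Value'], result['Value'])
result['Total'] = result['Qte'] * result['U.Price']
print(result)
```

np.where evaluates both branches for the whole column and then selects per row, which is usually faster than apply(..., axis=1) on large frames.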



# Annex: price list format:
JobNature Type  Subtype descrition    ref  U.Price  Qte  Total
N1        2FO   AR               …   REF1        1    0      0
                UN               …   REF2        2    0      0
          4FO   AR               …   REF3        3    0      0
                UN               …   REF4        4    0      0
          12FO  AR               …   REF5        5    0      0
                UN               …   REF6        6    0      0
          item1 NaN              …   REF6        6    0      0
N2        item1 NaN              …   REF7        7    0      0
          Item2 NaN              …   REF8        8    0      0
          Item3 NaN              …   REF9        9    0      0
N3        DIAM1 IN               …  REF10       10    0      0
                OUT              …  REF11       11    0      0
          DIAM2 IN               …  REF12       12    0      0
                OUT              …  REF13       13    0      0
          DIAM3 IN               …  REF14       14    0      0
                OUT              …  REF15       15    0      0
          item1 NaN              …  REF16       16    0      0
          Item2 NaN              …  REF17       17    0      0
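Some key values in the job data do not appear in the price list as written (for example Subtype 'INNER' in the jobs versus 'IN' and 'OUT' in the prices), so a left merge would leave those rows without a price. A small sketch of how such rows could be spotted with the `indicator` option of merge; the frames `jobs` and `prices` are cut-down copies of values shown in this post, and the name `unmatched` is illustrative.

```python
import pandas as pd

# Two job rows taken from the sample data in this post.
jobs = pd.DataFrame({
    'JobNature': ['N3', 'N1'],
    'Type': ['DIAM1', '2FO'],
    'Subtype': ['INNER', 'AR'],
})

# Three price rows taken from the annex.
prices = pd.DataFrame({
    'JobNature': ['N3', 'N3', 'N1'],
    'Type': ['DIAM1', 'DIAM1', '2FO'],
    'Subtype': ['IN', 'OUT', 'AR'],
    'U.Price': [10, 11, 1],
})

# indicator=True adds a '_merge' column telling where each row came from.
checked = jobs.merge(prices, how='left',
                     on=['JobNature', 'Type', 'Subtype'], indicator=True)

# Rows found only in the job data have no matching price.
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched)
```

Rows listed in `unmatched` point at keys that need to be normalized (spelling, case, or abbreviations) before the totals can be computed.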

0 answers:

No answers yet