Splitting values in the index of a pandas data frame

Time: 2019-03-12 14:16:05

Tags: python pandas split

I have a data frame of this type:

import pandas as pd

d = {'a': [100,150,180,190]}
df = pd.DataFrame(data=d, index=[(2010,1), (2010,2), (2011,1), (2011,2)])

which returns

Out[91]:
             a
(2010, 1)  100
(2010, 2)  150
(2011, 1)  180
(2011, 2)  190

My goal is to split the values in the index and make the data frame more readable while preserving the information held in the index.

Any help?

1 answer:

Answer 0: (score: 1)

You can select the values of each tuple by indexing, and finally create a default index with DataFrame.reset_index and drop=True:

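# .str[0] / .str[1] pick the first / second element of each index tuple
df['year'] = df.index.str[0]
df['class'] = df.index.str[1]
# drop the tuple index and replace it with the default 0..n-1 index
df = df.reset_index(drop=True)
print(df)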
     a  year  class
0  100  2010      1
1  150  2010      2
2  180  2011      1
3  190  2011      2
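
For a version that can be run from scratch, the same split can also be sketched with plain tuple unpacking instead of the .str accessor. This is only a minimal sketch; it assumes every index entry is a 2-tuple of (year, class), and it reuses the column names year and class from the answer above.

```python
import pandas as pd

d = {'a': [100, 150, 180, 190]}
df = pd.DataFrame(data=d, index=[(2010, 1), (2010, 2), (2011, 1), (2011, 2)])

# zip(*...) transposes the list of (year, class) tuples into two tuples
# (assumption: every index entry has exactly two elements)
years, classes = zip(*df.index.tolist())

# plain lists are assigned by position, so the tuple index does not matter here
df['year'] = list(years)
df['class'] = list(classes)

# replace the tuple index with the default integer index
df = df.reset_index(drop=True)
print(df)
```

The unpacking raises a ValueError if the tuples do not all have two elements, which may be a useful early check on irregular data.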

Another idea, starting again from the original df, is to create a new DataFrame from the index and join it to the original:

# build a frame from the index tuples, aligned on the original index
df1 = pd.DataFrame(df.index.tolist(), columns=['year','class'], index=df.index)
df = df.join(df1).reset_index(drop=True)
print(df)
     a  year  class
0  100  2010      1
1  150  2010      2
2  180  2011      1
3  190  2011      2
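
The join approach extends naturally to tuples of any fixed length. The following is a hedged sketch: split_tuple_index is a hypothetical helper name (not a pandas function), and it assumes every index entry is a tuple with as many elements as there are names.

```python
import pandas as pd


def split_tuple_index(df, names):
    """Hypothetical helper: expand a tuple-valued index into columns.

    Assumes each index entry is a tuple of len(names) elements.
    Returns a new frame with a default integer index; df is not modified.
    """
    parts = pd.DataFrame(df.index.tolist(), columns=names, index=df.index)
    return df.join(parts).reset_index(drop=True)


df = pd.DataFrame({'a': [100, 150, 180, 190]},
                  index=[(2010, 1), (2010, 2), (2011, 1), (2011, 2)])
out = split_tuple_index(df, ['year', 'class'])
print(out)
```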

Another idea, again starting from the original df, is to create a MultiIndex with MultiIndex.from_tuples:

# convert the index of tuples into a two-level MultiIndex
df.index = pd.MultiIndex.from_tuples(df.index, names=['year','class'])
print(df)
              a
year class     
2010 1      100
     2      150
2011 1      180
     2      190

Then, if needed, turn the index levels into columns:

# move both index levels into ordinary columns
df = df.reset_index()
print(df)
   year  class    a
0  2010      1  100
1  2010      2  150
2  2011      1  180
3  2011      2  190
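
If the aim is to keep the information in the index rather than move it into columns, the MultiIndex form can be queried by level directly. A small sketch, assuming the same data as in the question; the variable names rows_2010 and class_1 are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': [100, 150, 180, 190]},
                  index=[(2010, 1), (2010, 2), (2011, 1), (2011, 2)])
df.index = pd.MultiIndex.from_tuples(df.index, names=['year', 'class'])

# all rows for year 2010; the result is indexed by the remaining 'class' level
rows_2010 = df.loc[2010]
print(rows_2010)

# all rows for class 1 across years, selected by level name
class_1 = df.xs(1, level='class')
print(class_1)
```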