I have a DataFrame of this type:
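import pandas as pd
d = {'a': [100, 150, 180, 190]}
df = pd.DataFrame(data=d, index=[(2010, 1), (2010, 2), (2011, 1), (2011, 2)])
which returns:
Out[91]:
             a
(2010, 1)  100
(2010, 2)  150
(2011, 1)  180
(2011, 2)  190
My goal is to split the tuples in the index into separate values and make the DataFrame more readable while keeping the information the index carries.
Any help?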
Answer 0 (score: 1)
You can select the values of each tuple by indexing with str, and finally create a default index with DataFrame.reset_index and drop=True:
df['year'] = df.index.str[0]
df['class'] = df.index.str[1]
df = df.reset_index(drop=True)
print(df)
     a  year  class
0  100  2010      1
1  150  2010      2
2  180  2011      1
3  190  2011      2
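To make it clearer what `df.index.str[0]` does here, the following minimal sketch compares it with a plain list comprehension. It assumes the index holds tuples exactly as in the question; the variable names (`years_via_str`, `years_via_loop`, `classes_via_loop`) are only illustrative and not part of the original answer.

```python
import pandas as pd

d = {'a': [100, 150, 180, 190]}
df = pd.DataFrame(data=d, index=[(2010, 1), (2010, 2), (2011, 1), (2011, 2)])

# .str[0] indexes into each element of the Index; for tuples this takes
# the first item of every tuple.
years_via_str = list(df.index.str[0])

# The same values, written as explicit loops over the tuples.
years_via_loop = [t[0] for t in df.index]
classes_via_loop = [t[1] for t in df.index]

print(years_via_str)
print(classes_via_loop)
```

Despite its name, the `.str` accessor is used here purely for element-wise indexing, so it is applied to tuples rather than to strings.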
Another idea is to create a new DataFrame from the tuples and join it to the original:
df1 = pd.DataFrame(df.index.tolist(), columns=['year','class'], index=df.index)
df = df.join(df1).reset_index(drop=True)
print(df)
     a  year  class
0  100  2010      1
1  150  2010      2
2  180  2011      1
3  190  2011      2
Another idea is to create a MultiIndex with MultiIndex.from_tuples:
df.index = pd.MultiIndex.from_tuples(df.index, names=['year','class'])
print(df)
              a
year class
2010 1      100
     2      150
2011 1      180
     2      190
Then, if needed, convert the index levels to columns:
df = df.reset_index()
print(df)
   year  class    a
0  2010      1  100
1  2010      2  150
2  2011      1  180
3  2011      2  190
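If the MultiIndex is kept instead of being reset to columns, the index information stays available for label-based selection. The sketch below rebuilds the frame from the question and shows two such lookups; the variable names `rows_2010` and `class_1` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

d = {'a': [100, 150, 180, 190]}
df = pd.DataFrame(data=d, index=[(2010, 1), (2010, 2), (2011, 1), (2011, 2)])
df.index = pd.MultiIndex.from_tuples(df.index, names=['year', 'class'])

# Select every row of year 2010; the result is indexed by the remaining
# 'class' level.
rows_2010 = df.loc[2010]

# Select every row of class 1 across years; the result is indexed by 'year'.
class_1 = df.xs(1, level='class')

print(rows_2010)
print(class_1)
```

Whether to keep the MultiIndex or reset it to columns depends on how the data is used afterwards: columns are usually easier for filtering and plotting, while index levels suit lookups like the ones above.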