How can I pass df10 and df20 (or even more DataFrames) through func at once and keep their names for further use?
import pandas as pd
import numpy as np
df = pd.DataFrame( {
'A': ['d','d','d','d','d','d','g','g','g','g','g','g','k','k','k','k','k','k'],
'B': [5,5,6,4,5,6,-6,7,7,6,-7,7,-8,7,-6,6,-7,50],
'C': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2],
'S': [2012,2013,2014,2015,2016,2012,2012,2014,2015,2016,2012,2013,2012,2013,2014,2015,2016,2014]
} );
df10 = (df.B + df.C).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)
df20 = (df['B'] - df['C']).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)
def func(df):
    # Sum across the second column level to get one total per top-level column
    df1 = df.groupby(level=0, axis=1).sum()
    new_cols = list(zip(df1.columns.get_level_values(0), ['total'] * len(df.columns)))
    df1.columns = pd.MultiIndex.from_tuples(new_cols)
    # Combine totals with the per-year columns and sort the columns
    df2 = pd.concat([df1, df], axis=1).sort_index(axis=1).sort_index(axis=1, level=1)
    # Flatten the MultiIndex columns into single strings
    df2.columns = ['_'.join((col[0], str(col[1]))) for col in df2.columns]
    df2.columns = df2.columns.str.replace('sum_', '')
    df2.columns = df2.columns.str.replace('size_', 'T')
    return df2
Edit: printing the DataFrames as requested:
print(df10)
print(df20)
df10:
   sum                      size
S  2012 2013 2014 2015 2016 2012 2013 2014 2015 2016
A
d    13    6    7    5    6    2    1    1    1    1
g   -11    8    8    8    7    2    1    1    1    1
k    -6    9   48    8   -5    1    1    2    1    1
df20:
   sum                      size
S  2012 2013 2014 2015 2016 2012 2013 2014 2015 2016
A
d     9    4    5    3    4    2    1    1    1    1
g   -15    6    6    6    5    2    1    1    1    1
k   -10    5   40    4   -9    1    1    2    1    1
Answer 0 (score: 4)
Edit: There may be better ways to do this; I just thought I would suggest it. If this is not what was asked for, let me know and I will delete it.
How can I pass df10 and df20 (or even more DataFrames) through func at once and keep their names for further use?
If you simply want to pass multiple DataFrames through func, and all of the DataFrames have the same format, the following may work.
For simplicity, take these DataFrames:
df10 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
df20 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
df30 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
and a simple function:
def your_func(df):
    # Perform some action/change on df, e.g. keep only the first row
    df2 = df.head(1)
    return df2
Create a list of the original DataFrames:
A = [df10, df20, df30]
A = [   one  two
0  1.0  4.0
1  2.0  3.0
2  3.0  2.0
3  4.0  1.0,
   one  two
0  1.0  4.0
1  2.0  3.0
2  3.0  2.0
3  4.0  1.0,
   one  two
0  1.0  4.0
1  2.0  3.0
2  3.0  2.0
3  4.0  1.0]
Then use a for loop to pass each DataFrame in the list through the function, for example:
for i in range(0, len(A)):
    A[i] = your_func(A[i])
This leaves the original DataFrames unchanged, because the loop only rebinds the list elements and your_func returns a new object.
Output:
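So the list is now
A = [   one  two
0  1.0  4.0,
   one  two
0  1.0  4.0,
   one  two
0  1.0  4.0]
and contains each new DataFrame. Your original DataFrames df10, df20, etc. remain unchanged. Simply access the elements of A (for example A[0]) to get your new DataFrames.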
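A list keeps the results in order but loses the names df10, df20, and so on. If the names matter for later steps, one possible variation is to keep the DataFrames in a dict keyed by name. This is only a sketch: the names frames and results are made up for illustration, and your_func is the same stand-in function as above, not the func from the question.

```python
import pandas as pd

# Same toy DataFrames as in the answer.
df10 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
df20 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
df30 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})


def your_func(df):
    # Stand-in transformation: keep only the first row.
    return df.head(1)


# Hypothetical container: map each name (as a string) to its DataFrame.
frames = {'df10': df10, 'df20': df20, 'df30': df30}

# Build a new dict with the same keys; the originals are not modified.
results = {name: your_func(frame) for name, frame in frames.items()}

# Look up a transformed DataFrame by its original name.
print(results['df20'])
```

Each result is then available by name, for example results['df10'], while df10 itself still holds all four rows.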