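I have a loop that generates a plot for each column of a pandas DataFrame. I am using IPython, but all the plots are displayed at the end of the loop rather than at the point in my code where I ask for them.
How can I force IPython/pandas to display each column's plot at the exact point where I call the plot function?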
def explore(file, sep=";", top = 5, k='Code Agence'):
    """
    """
    %matplotlib inline
    import time
    import matplotlib.pyplot as plt
    import pandas as pd
    import time
    import sys
    dataframes_top = []
    start = time.time()
    #print "Exploring :", get_file_name(file), "with %s lines"%(top)
    to_explore = pd.read_csv(file, sep=";", error_bad_lines=False)
    cols = to_explore.columns
    i = -1
    for col in cols:
        i += 1
        serie = to_explore[col]
        try:
            print "plotting %s" % (col)
            serie.plot().show()
            time.sleep(2)
        except Exception as e:
            "plotting issue :%s" % (e)
        #serie.index = index
        null = serie.isnull()
        not_null = len([x for x in null if not x])
        r = not_null/len(serie)
        s = serie.value_counts()  # return value as index, count as value
        pct_top = s.values[:top]/not_null
        serie_top_n = pd.Series(s.values[:top], index=s.index[:top])
        local_df = pd.DataFrame()
        local_df[col] = serie_top_n
        local_df['pct'] = pct_top
        somme = local_df['pct'].sum()
        pct_2_top = s.values[:top*2]/not_null
        serie_2_top_n = pd.Series(s.values[:top*2], index=s.index[:top*2])
        local_df_2_top = pd.DataFrame()
        local_df_2_top[col] = serie_2_top_n
        local_df_2_top['pct'] = pct_2_top
        somme_2_top = local_df_2_top['pct'].sum()
        print
        print "%s : [col %s = %s ] " % (get_file_name(file), i, col)
        print
        print "%.2f" % (r), " pct not null"
        print "%.2f pct on the first %s " % (somme, top)
        print "%.2f pct on the first %s " % (somme_2_top, 2*top)
        print "plot :"
        print pd.DataFrame(serie.describe()).T
        print
        print local_df.T
        print "plot :"
        local_df.plot()
        print "="*100
        dataframes_top.append(local_df)
    elapsed = time.time()-start
    print "="*20, elapsed, "for %s lines" % (len(serie)), "="*20
    sys.stdout.flush()
Answer 0 (score: 0)
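Be sure to call plt.show() every time you plot a new graph. If you don't, IPython automatically buffers every figure and displays them all when it reaches the end of the cell. I think you forgot to do this at the end of your loop.
Here is an example of some code that plots each graph inside the loop instead of waiting until the end: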
%matplotlib inline
import matplotlib.pyplot as plt
from pandas import Series, date_range
from numpy.random import randn

for i in range(5):
    print("Before graph {0}".format(i))
    # Random-walk time series over 1000 consecutive days
    ts = Series(randn(1000), index=date_range('1/1/2000', periods=1000))
    ts = ts.cumsum()
    ts.plot()
    plt.show()  # display this figure now instead of at the end of the cell
    print("After graph {0}".format(i))
When I run this, each graph appears between the printed lines, as desired.
I am using IPython Notebook version 3.0.0-f75fda4 with Python 3.
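The same pattern can be applied to the per-column loop from the question. The sketch below is only an illustration, not the asker's actual function: `plot_each_column` and the `demo` DataFrame are hypothetical names made up here, and skipping non-numeric columns is an assumption about what is wanted. One likely source of trouble in the question's code is that `Series.plot()` returns a matplotlib Axes, which has no `show()` method, so `serie.plot().show()` raises an AttributeError. The `except` block there evaluates a string without printing it, so the error goes unnoticed. In a notebook, run `%matplotlib inline` in a cell before this code.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_each_column(df):
    """Plot every numeric column of df in its own figure, one at a time.

    Hypothetical helper for illustration; returns the names of the
    columns that were plotted.
    """
    plotted = []
    for col in df.columns:
        serie = df[col]
        # Assumption: non-numeric columns are skipped rather than plotted.
        if not pd.api.types.is_numeric_dtype(serie):
            print("skipping non-numeric column %s" % col)
            continue
        print("plotting %s" % col)
        fig, ax = plt.subplots()      # fresh figure for each column
        serie.plot(ax=ax, title=str(col))
        plt.show()                    # call show on pyplot, not on the Axes
        plt.close(fig)                # release the figure once displayed
        plotted.append(col)
    return plotted


# Hypothetical sample data standing in for the CSV from the question.
demo = pd.DataFrame({
    "a": np.arange(10),
    "b": np.arange(10) ** 2,
    "label": list("abcdefghij"),
})
plotted_columns = plot_each_column(demo)
```

Because `plt.show()` is called inside the loop, each "plotting ..." line should be followed by its figure before the next column is processed. Creating the figure explicitly with `plt.subplots()` also keeps successive columns from being drawn onto the same axes.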