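I have a convenience function that generates the counts for each level of my categorical variables (1) and then displays them as a pretty summary (2).
This is step (1), which produces the counts: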
import plotly.plotly as py
stringCol = list(df.select_dtypes(include=['object']))  # names of the categorical (object-dtype) columns
dfs_ct = [df[c]  # one DataFrame of counts per categorical column
          .value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=False)  # count each level, NaN included
          .rename_axis(mapper=c, axis=0, copy=True, inplace=False)  # label the index with the column name
          .to_frame(name='count')  # name the counts column "count"
          .applymap("{:,}".format)  # add thousands separator
          for c in stringCol]
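To make the shape of the result concrete, here is a minimal sketch of step (1) on a small made-up DataFrame. The `color` and `size` columns and their values are hypothetical and used only for illustration. The sketch also swaps `applymap` for `Series.map` applied before `to_frame`, which is an assumption made so that it runs without a deprecation warning on recent pandas versions; the formatting result should be the same.

```python
import pandas as pd

# Hypothetical toy data; column names and values are made up for illustration.
df = pd.DataFrame({
    "color": pd.Series(["red"] * 1200 + ["blue"] * 300 + [None] * 5, dtype="object"),
    "size": pd.Series(["S"] * 1000 + ["M"] * 505, dtype="object"),
})

# Names of the object-dtype (categorical) columns.
stringCol = list(df.select_dtypes(include=["object"]))

dfs_ct = [
    df[c]
    .value_counts(dropna=False)   # counts per level, missing values included
    .map("{:,}".format)           # format each count with a thousands separator
    .rename_axis(c)               # label the index with the column name
    .to_frame(name="count")       # one-column DataFrame named "count"
    for c in stringCol
]

# Print each counts table for a quick look.
for table in dfs_ct:
    print(table, end="\n\n")
```

Each element of `dfs_ct` is a one-column DataFrame of formatted strings, indexed by the levels of one categorical variable.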
Step (2) creates the pretty summary, showing the counts for each level of every categorical variable side by side:
# create a helper function that takes pd.DataFrames as input and outputs pretty, compact EDA results
from IPython.display import display_html
def display_side_by_side(*args):
    html_str = ''
    for df in args:
        html_str = html_str + df.to_html()
    display_html(html_str.replace('table', 'table style="display:inline"'), raw=True)
Example output:
How would I modify step (2) to generate a Seaborn histogram below each table of counts?
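To make the goal concrete, here is a rough sketch of one possible direction, not a confirmed solution. It assumes that a bar plot of the counts (`sns.barplot`) is an acceptable stand-in for a histogram, since the variables are categorical, and that each plot can be embedded as a base64 PNG inside the same HTML string as its table. The helper names `counts_with_plot_html` and `side_by_side_html`, the figure size, and the toy data are all hypothetical.

```python
import base64
import io

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def counts_with_plot_html(series, figsize=(3, 2)):
    """Return an HTML snippet: the counts table with a bar plot of the same counts below it."""
    counts = series.value_counts(dropna=False)

    # Draw the counts as a bar plot; a missing level is shown with the label "nan".
    fig, ax = plt.subplots(figsize=figsize)
    sns.barplot(x=[str(v) for v in counts.index], y=counts.tolist(), ax=ax)
    ax.set_xlabel(str(series.name))
    ax.set_ylabel("count")

    # Save the figure to an in-memory PNG and encode it for a data URI.
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight")
    plt.close(fig)  # keep the notebook from rendering the figure a second time
    img = base64.b64encode(buf.getvalue()).decode("ascii")

    # Same formatted counts table as in step (1).
    table = (
        counts.map("{:,}".format)
        .rename_axis(series.name)
        .to_frame(name="count")
        .to_html()
    )
    return (
        '<div style="display:inline-block; vertical-align:top; margin-right:1em">'
        f'{table}<br><img src="data:image/png;base64,{img}"></div>'
    )


def side_by_side_html(df, columns):
    """Concatenate one table-plus-plot block per column into a single HTML string."""
    return "".join(counts_with_plot_html(df[c]) for c in columns)


# Hypothetical toy data, for illustration only.
toy = pd.DataFrame({"color": pd.Series(["red"] * 1200 + ["blue"] * 300, dtype="object")})
html_str = side_by_side_html(toy, ["color"])
```

In a notebook, the resulting string could then be shown with `display_html(html_str, raw=True)`, as in step (2). Wrapping each table and its plot in an inline-block `div` keeps the pairs side by side while stacking the plot under its own table.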