Filter, replace, and group values inside a Python plot call

Time: 2018-08-21 14:26:30

Tags: python-3.x pandas plot jupyter-notebook

Original dataframe

    Country  Gender Arr-Dep  Year  Value
0   Austria    Male      IN  1974  13728
1   Austria    Male     OUT  1974  17977
2   Austria  Female      IN  1974   8541
3   Austria  Female     OUT  1974   8450
4   Austria   Total      IN  1974  22269
5   Austria   Total     OUT  1974  26427
6   Belgium    Male      IN  1974   2412
7   Belgium    Male     OUT  1974   2800
8   Belgium  Female      IN  1974   2105
9   Belgium  Female     OUT  1974   2100
10  Belgium   Total      IN  1974   4517

At the start of my code I import the following libraries (in a Jupyter notebook, using plotly in offline mode):

import pandas as pd
import numpy as np
import plotly as py
import plotly.figure_factory as ff
import plotly.graph_objs as go
from IPython import display
import os
py.offline.init_notebook_mode()

Then, to avoid any errors, I replace the '-' values (with NaN in the code below) and group by the column I need (Year):

#Replace non numerical values from the Value column
df1['Value'] = df1['Value'].replace('-', np.nan)

# Group by Year and sum the values
df1 = df1.groupby(['Year'], as_index=False)['Value'].sum()

Then I create the figure with plotly:

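# Plot everything in a graph
py.offline.iplot({
    # go.Line is deprecated in newer plotly; go.Scatter with mode="lines" is the supported line trace
    "data": [go.Scatter(x=df1.Year,
                        y=df1.Value,
                        mode="lines")],
    "layout": go.Layout(title="Immigration through the years")
})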

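My question is: can the last bit, the one that creates the figure, be changed so that it does the filtering/replacing or the groupby itself? That way I could get rid of the two steps before creating the figure.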

1 answer:

Answer 0: (score: 1)

Your approach already looks like the right, clean way to do it!

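The two lines involving replace and groupby are data preparation steps. The last step is the visualization (or data presentation) step. Keeping them separate makes your code more readable!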

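Also, the two lines involving replace and groupby cannot be collapsed into a single operation, because one modifies values in the Value column while the other aggregates by Year.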
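
If the goal is only to have fewer loose statements before the plot, one option is to wrap the preparation in a small helper. The sketch below is an illustration, not part of the original code: the function name yearly_totals and the sample data are made up, and pd.to_numeric with errors="coerce" stands in for the replace step (it turns '-' and any other non-numeric entry into NaN). The cleaning and the aggregation are still two separate operations, just chained.

```python
import pandas as pd


def yearly_totals(df):
    """Hypothetical helper: clean the Value column, then sum it per Year."""
    return (
        df.assign(
            # Non-numeric entries such as '-' become NaN
            Value=pd.to_numeric(df["Value"], errors="coerce")
        )
        .groupby("Year", as_index=False)["Value"]
        .sum()  # NaN values are skipped by default
    )


# Made-up sample in the same shape as the question's dataframe
sample = pd.DataFrame({
    "Country": ["Austria", "Austria", "Belgium", "Belgium"],
    "Year": [1974, 1974, 1975, 1975],
    "Value": ["13728", "-", "2412", "2800"],
})

totals = yearly_totals(sample)
```

The plotting call then stays exactly as it is, reading x from totals.Year and y from totals.Value, so the visualization step remains free of data-cleaning logic.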