Python: shift a row to the right when a value is missing, then fill by backfilling

Time: 2018-07-08 07:46:59

Tags: python pandas dataframe

Data:

         year      all deceased living   data
0        2018    7,107    4,394  2,713   None
1        2017   16,478   10,286  6,192   None
2        2016   15,944    9,971  5,973   None
3     Alabama  To Date    5,926  3,471  2,455
124      1990       85       49     36   None
125      1989       80       57     23   None
126      1988       86       68     18   None
127  Arkansas  To Date    2,963  1,931  1,032
128      1989       16       12      4   None
129      1988       16       11      5   None

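I want to detect the rows where data = None, shift those rows one column to the right so that the first column becomes empty, and then fill that column by backfilling.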

Desired result:

        state     year     all deceased living
0        None     2018   7,107    4,394  2,713
1        None     2017  16,478   10,286  6,192
2        None     2016  15,944    9,971  5,973
3     Alabama  To Date   5,926    3,471  2,455
124   Alabama     1990      85       49     36
125   Alabama     1989      80       57     23
126   Alabama     1988      86       68     18
127  Arkansas  To Date   2,963    1,931  1,032
128  Arkansas     1989      16       12      4
129  Arkansas     1988      16       11      5
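
For concreteness, here is a minimal sketch of the shift-and-fill idea described above. The sample frame is a hypothetical reconstruction of a few rows from the data (all values held as strings, which is an assumption), and the variable names are illustrative. Note that the desired output corresponds to a forward fill (`ffill`), since each state name appears above the rows it labels.

```python
import pandas as pd

# Hypothetical reconstruction of a few rows of the data above (values as strings).
df = pd.DataFrame(
    {
        "year": ["2018", "2017", "Alabama", "1990", "1989", "Arkansas", "1989"],
        "all": ["7,107", "16,478", "To Date", "85", "80", "To Date", "16"],
        "deceased": ["4,394", "10,286", "5,926", "49", "57", "2,963", "12"],
        "living": ["2,713", "6,192", "3,471", "36", "23", "1,931", "4"],
        "data": [None, None, "2,455", None, None, "1,032", None],
    }
)

cols = list(df.columns)

# Rows with an empty last column are ordinary data rows that sit one column too far left.
is_data_row = df["data"].isna()

# Copy the first four columns into the last four, then blank out the first column.
shifted = df.loc[is_data_row, cols[:-1]].to_numpy()
df.loc[is_data_row, cols[1:]] = shifted
df.loc[is_data_row, cols[0]] = None

# After the shift the columns hold state, year, all, deceased, living.
df.columns = ["state", "year", "all", "deceased", "living"]

# Carry each state name down to the rows below it.
df["state"] = df["state"].ffill()

print(df)
```

The rows above the first state header stay empty in `state`, as in the desired result, because there is nothing earlier to fill from.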

Finally, I will drop the rows where year = To Date, so that it becomes a proper dataset.

Thanks.

1 answer:

Answer 0: (score: 0)

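There are a few steps involved here. I think the simplest approach is to define your state series first, then drop the subheader rows, and, as a final step, convert the appropriate columns to numeric.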

import locale

import numpy as np
import pandas as pd

# set locale, for converting strings with commas to integers
# ('' selects the user's default locale, which must use ',' as the thousands separator)
locale.setlocale(locale.LC_NUMERIC, '')

# define state (the non-numeric values in 'year') and forward fill
df['state'] = np.where(pd.to_numeric(df['year'], errors='coerce').isnull(),
                       df['year'], np.nan)
df['state'] = df['state'].ffill()

# drop To Date rows and data column
# (the axis must be passed by keyword; positional drop('data', 1) fails in pandas 2.0+)
df = df[~(df['all'] == 'To Date')].drop(columns='data')

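# convert data to numeric
# (column-wise Series.map is used because DataFrame.applymap is deprecated in recent pandas)
num_cols = ['year', 'all', 'deceased', 'living']
df[num_cols] = df[num_cols].apply(lambda col: col.map(locale.atoi))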

Result:

print(df)

     year    all  deceased  living     state
0    2018   7107      4394    2713       NaN
1    2017  16478     10286    6192       NaN
2    2016  15944      9971    5973       NaN
124  1990     85        49      36   Alabama
125  1989     80        57      23   Alabama
126  1988     86        68      18   Alabama
128  1989     16        12       4  Arkansas
129  1988     16        11       5  Arkansas
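
Since `locale.atoi` depends on the machine's locale settings (under the C locale, for example, a string such as '7,107' does not parse), the same steps can also be sketched without `locale` by stripping the commas explicitly. This is a minimal sketch on a hypothetical reconstruction of a few rows of the question's data, assuming every value is stored as a string; it is a variation on the approach above, not a replacement for it.

```python
import pandas as pd

# Hypothetical reconstruction of a few rows of the question's data (values as strings).
df = pd.DataFrame(
    {
        "year": ["2018", "2017", "Alabama", "1990", "1989", "Arkansas", "1989"],
        "all": ["7,107", "16,478", "To Date", "85", "80", "To Date", "16"],
        "deceased": ["4,394", "10,286", "5,926", "49", "57", "2,963", "12"],
        "living": ["2,713", "6,192", "3,471", "36", "23", "1,931", "4"],
        "data": [None, None, "2,455", None, None, "1,032", None],
    }
)

# Sub-header rows carry the state name in 'year', so 'year' is non-numeric there.
is_header = pd.to_numeric(df["year"], errors="coerce").isna()

# Keep 'year' only on sub-header rows, then carry each state name downwards.
df["state"] = df["year"].where(is_header).ffill()

# Drop the 'To Date' sub-header rows and the leftover 'data' column.
df = df[df["all"] != "To Date"].drop(columns="data").copy()

# Strip thousands separators and convert to integers, independent of locale.
num_cols = ["year", "all", "deceased", "living"]
for col in num_cols:
    df[col] = df[col].str.replace(",", "", regex=False).astype(int)

print(df)
```

The explicit `.copy()` keeps the later column assignments from acting on a slice of the original frame.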