How to expand a pandas DataFrame by the range given in other columns

Asked: 2019-08-05 13:34:50

Tags: python python-3.x pandas

I have the following DataFrame:

import pandas as pd

dt = pd.DataFrame({'id_audience': ['Female 13-17', 'Female 18-20'],
                   'gender': ['female', 'female'],
                   'age_min': [13, 18],
                   'age_max': [17, 20]})

I want to expand this DataFrame by adding one column (age), where age takes every value from age_min to age_max, inclusive.

The final result should look like this:

dt = pd.DataFrame({'id_audience': ['Female 13-17', 'Female 13-17', 'Female 13-17', 'Female 13-17',
                                   'Female 13-17', 'Female 18-20', 'Female 18-20', 'Female 18-20'],
                   'gender': ['female', 'female', 'female', 'female', 'female', 'female', 'female', 'female'],
                   'age_min': [13, 13, 13, 13, 13, 18, 18, 18],
                   'age_max': [17, 17, 17, 17, 17, 20, 20, 20],
                   'age': [13, 14, 15, 16, 17, 18, 19, 20]})

Any ideas?

3 answers:

Answer 0 (score: 4)

You can also use explode, as Wen does, but build each range directly from the age_min and age_max columns:


import numpy as np

dt.assign(
    age=[np.arange(x, y + 1) for x, y in zip(dt['age_min'], dt['age_max'])]
).explode('age').reset_index(drop=True)

    id_audience  gender  age_min  age_max age
0  Female 13-17  female       13       17  13
1  Female 13-17  female       13       17  14
2  Female 13-17  female       13       17  15
3  Female 13-17  female       13       17  16
4  Female 13-17  female       13       17  17
5  Female 18-20  female       18       20  18
6  Female 18-20  female       18       20  19
7  Female 18-20  female       18       20  20
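
The output above shows age as an object column, which is what explode produces from list-like cells. A minimal self-contained sketch of the same approach follows, assuming you want integers back; the names ages and expanded and the final astype cast are additions for illustration, not part of the answer.

```python
import numpy as np
import pandas as pd

dt = pd.DataFrame({'id_audience': ['Female 13-17', 'Female 18-20'],
                   'gender': ['female', 'female'],
                   'age_min': [13, 18],
                   'age_max': [17, 20]})

# One array of ages per row; the +1 makes age_max inclusive.
ages = [np.arange(lo, hi + 1) for lo, hi in zip(dt['age_min'], dt['age_max'])]

expanded = (
    dt.assign(age=ages)
      .explode('age')              # one output row per element of each array
      .reset_index(drop=True)
      .astype({'age': 'int64'})    # assumption: cast the exploded object column back to int
)
print(expanded.dtypes)
```

The cast is optional; it only matters if later code relies on age being numeric.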

Answer 1 (score: 3)

Here is one way, using explode, which is new in pandas 0.25.0:

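# Pull every number out of id_audience; each row yields two matches (min and max age)
s = dt['id_audience'].str.extractall(r'(\d+)')

# Build the inclusive age range for each original row, then explode to one row per age
dt['age'] = [list(range(y.iloc[0, 0], y.iloc[1, 0] + 1)) for _, y in s.astype(int).groupby(level=0)]
dt = dt.explode('age').reset_index(drop=True)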
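
Note that this answer reads the bounds from the id_audience strings rather than from age_min and age_max. To make the indexing easier to follow, here is a small sketch of what extractall returns for the sample data; the bounds dictionary is a hypothetical name used only for inspection.

```python
import pandas as pd

dt = pd.DataFrame({'id_audience': ['Female 13-17', 'Female 18-20'],
                   'gender': ['female', 'female']})

# extractall returns one row per regex match, indexed by
# (original row label, match number), with a single column named 0.
s = dt['id_audience'].str.extractall(r'(\d+)').astype(int)
print(s)

# Grouping on the first index level gives one small frame per original row:
# its first match is the lower bound and its second match is the upper bound.
bounds = {int(row): (int(grp.iloc[0, 0]), int(grp.iloc[1, 0]))
          for row, grp in s.groupby(level=0)}
print(bounds)
```

This relies on every id_audience value containing exactly two numbers in min, max order, which holds for the sample data but is an assumption about real data.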

Answer 2 (score: 2)

Use Index.repeat to duplicate each row, with GroupBy.cumcount as a counter for the age column:

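# Repeat each row once per age in its range (age_max is inclusive, hence the +1)
dt = dt.loc[dt.index.repeat(dt['age_max'] - dt['age_min'] + 1)]
# cumcount numbers the copies of each original row 0, 1, 2, ...
dt['age'] = dt['age_min'] + dt.groupby(level=0).cumcount()
dt = dt.reset_index(drop=True)
print(dt)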
    id_audience  gender  age_min  age_max  age
0  Female 13-17  female       13       17   13
1  Female 13-17  female       13       17   14
2  Female 13-17  female       13       17   15
3  Female 13-17  female       13       17   16
4  Female 13-17  female       13       17   17
5  Female 18-20  female       18       20   18
6  Female 18-20  female       18       20   19
7  Female 18-20  female       18       20   20
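
For reuse, the repeat and cumcount steps can be wrapped in a function. The sketch below is one possible packaging, not the answerer's code: expand_ages and its parameter names are hypothetical, and it assumes the input index is unique, since groupby(level=0) uses the repeated index labels to tell the original rows apart.

```python
import pandas as pd


def expand_ages(df, lo='age_min', hi='age_max', out='age'):
    """Hypothetical helper: return one row per integer in [lo, hi] for each input row."""
    counts = df[hi] - df[lo] + 1                      # rows needed per input row
    expanded = df.loc[df.index.repeat(counts)].copy()
    # cumcount restarts at 0 for every original row label
    expanded[out] = expanded[lo] + expanded.groupby(level=0).cumcount()
    return expanded.reset_index(drop=True)


dt = pd.DataFrame({'id_audience': ['Female 13-17', 'Female 18-20'],
                   'gender': ['female', 'female'],
                   'age_min': [13, 18],
                   'age_max': [17, 20]})

result = expand_ages(dt)
print(result)
```

Because age is computed by integer addition, it stays an integer column here, with no cast needed after the fact.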