Expand a pandas DataFrame by a column range

Time: 2020-02-22 11:53:13

Tags: python pandas

I have a pandas DataFrame with range columns and string columns, similar to this:

     STREET             LOWADD  HIGHADD POSTAL  SECTOR
0   ABBERLY CIR         1900    2000    23112   A6
1   ABBEY VILLAGE CIR   500     600     23114   B6

I need to expand/transform this into one row for each value between the LOWADD and HIGHADD columns, forward-filling the data in STREET, POSTAL and SECTOR:

New_Street              POSTAL  SECTOR
1901 ABBERLY CIR        23112   A6
1902 ABBERLY CIR        23112   A6
1903 ABBERLY CIR        23112   A6
1904 ABBERLY CIR        23112   A6
1905 ABBERLY CIR        23112   A6

What is the best way to do this with pandas?

1 answer:

Answer 0: (score: 2)

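The idea is to subtract the columns with Series.sub to get the number of times each row should be repeated, then repeat the rows with Index.repeat and DataFrame.loc, and finally add a counter Series from GroupBy.cumcount to LOWADD and prepend the result to the STREET column: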

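# Reset to a unique 0..n-1 index so each source row forms its own group
df = df.reset_index(drop=True)
# Number of rows to generate per street (HIGHADD - LOWADD)
diff = df['HIGHADD'].sub(df['LOWADD'])
# Repeat each row diff times; the repeated rows keep their original index label
df = df.loc[df.index.repeat(diff)]
# Per-group counter 0, 1, 2, ... shifted to LOWADD + 1 ... HIGHADD
s = df.groupby(level=0).cumcount().add(1).add(df['LOWADD']).astype(str)
df['STREET'] = s + ' ' + df['STREET']
df = df.drop(['LOWADD','HIGHADD'], axis=1).reset_index(drop=True)
print(df)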
                    STREET  POSTAL SECTOR
0         1901 ABBERLY CIR   23112     A6
1         1902 ABBERLY CIR   23112     A6
2         1903 ABBERLY CIR   23112     A6
3         1904 ABBERLY CIR   23112     A6
4         1905 ABBERLY CIR   23112     A6
..                     ...     ...    ...
195  596 ABBEY VILLAGE CIR   23114     B6
196  597 ABBEY VILLAGE CIR   23114     B6
197  598 ABBEY VILLAGE CIR   23114     B6
198  599 ABBEY VILLAGE CIR   23114     B6
199  600 ABBEY VILLAGE CIR   23114     B6

[200 rows x 3 columns]
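
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same repeat-and-count approach. The sample frame is rebuilt from the two rows shown in the question, and `expand_ranges` is a hypothetical helper name introduced only for illustration; it is not part of pandas.

```python
import pandas as pd

# Sample data rebuilt from the two rows shown in the question
df = pd.DataFrame({
    "STREET": ["ABBERLY CIR", "ABBEY VILLAGE CIR"],
    "LOWADD": [1900, 500],
    "HIGHADD": [2000, 600],
    "POSTAL": [23112, 23114],
    "SECTOR": ["A6", "B6"],
})


def expand_ranges(frame):
    """Hypothetical helper: one row per number in LOWADD + 1 .. HIGHADD."""
    frame = frame.reset_index(drop=True)
    # How many rows each source row should become
    counts = frame["HIGHADD"].sub(frame["LOWADD"])
    # Repeat each row; repeated rows share the source row's index label
    out = frame.loc[frame.index.repeat(counts)]
    # Counter within each source row (0, 1, 2, ...), shifted to LOWADD + 1
    numbers = out.groupby(level=0).cumcount().add(1).add(out["LOWADD"])
    out = out.assign(STREET=numbers.astype(str) + " " + out["STREET"])
    return out.drop(columns=["LOWADD", "HIGHADD"]).reset_index(drop=True)


expanded = expand_ranges(df)
print(expanded)
```

Note that the generated numbers start at LOWADD + 1 and end at HIGHADD, which matches the expected output in the question (1901 is the first row). If LOWADD itself should also be included, one possible adjustment, assuming that is the intent, is to repeat each row `counts + 1` times and drop the `.add(1)` step.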