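How to extract a number from a description and set it in another column of a pandas DataFrame

Asked: 2019-10-15 02:26:24

Tags: python pandas

I have a pandas DataFrame like the table below.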

|------------------------------------|
| ID |      Description       | area |
|------------------------------------|
| 1  | House with 80m2        | NaN  |
|------------------------------------|
| 2  | House with 100 meters  | NaN  |
|------------------------------------|
| 3  | House with 90 m2       | 90   |
|------------------------------------| 
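
For anyone who wants to reproduce this, the table above could be built roughly as follows. The dtypes are an assumption: `area` is taken to be a float column, since it holds NaN.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the table above; the dtypes are assumed, not given.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Description": ["House with 80m2", "House with 100 meters", "House with 90 m2"],
    "area": [np.nan, np.nan, 90.0],
})
print(df)
```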

I need to extract the number from the Description column and insert it into area wherever the value is NaN. The expected result:

|------------------------------------|
| ID |      Description       | area |
|------------------------------------|
| 1  | House with 80m2        |  80  |
|------------------------------------|
| 2  | House with 100 meters  |  100 |
|------------------------------------|
| 3  | House with 90 m2       |  90  |
|------------------------------------| 

Can anyone help me?

2 Answers:

Answer 0 (score: 1)

Assuming each description contains exactly one number (an integer), use np.where + str.extract:

import numpy as np
import pandas as pd
df['area'] = np.where(pd.isna(df.area), df.Description.str.extract(r'(\d+)', expand=False), df.area)
print(df)

Output

   ID            Description area
0   1        House with 80m2   80
1   2  House with 100 meters  100
2   3       House with 90 m2   90
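
As a self-contained sketch of the same idea, assuming the sample data from the question: `expand=False` makes `str.extract` return a Series rather than a one-column DataFrame, so the three arguments to `np.where` have the same shape. The `pd.to_numeric` step is an addition here, not part of the answer; it keeps the column numeric instead of mixing strings and floats.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Description": ["House with 80m2", "House with 100 meters", "House with 90 m2"],
    "area": [np.nan, np.nan, 90.0],
})

# expand=False returns a Series, so the shapes line up inside np.where.
# to_numeric converts the extracted digit strings to numbers.
extracted = pd.to_numeric(df["Description"].str.extract(r"(\d+)", expand=False))

# Take the extracted number where area is NaN, otherwise keep the existing area.
df["area"] = np.where(df["area"].isna(), extracted, df["area"])
print(df)
```

Note that `str.extract` only captures the first match of the pattern, so a description containing another number before the area would need a more specific regular expression.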

Answer 1 (score: 0)

To stay within pandas, try:

df['area'] = df['area'].fillna(df['Description'].str.extract(r'(\d+)')[0].astype(float))
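
A runnable sketch of the `fillna` approach, again assuming the sample data from the question. The intermediate name `numbers` and the conversion to float are illustrative choices, made so the filled values match the dtype of the existing `area` column.

```python
import numpy as np
import pandas as pd

# Sample data assumed from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Description": ["House with 80m2", "House with 100 meters", "House with 90 m2"],
    "area": [np.nan, np.nan, 90.0],
})

# str.extract returns a DataFrame by default; [0] selects the single capture group.
numbers = df["Description"].str.extract(r"(\d+)")[0].astype(float)

# fillna aligns on the index and only replaces rows where area is NaN.
df["area"] = df["area"].fillna(numbers)
print(df)
```

Compared with `np.where`, this keeps everything as pandas objects and leaves existing non-NaN values untouched without an explicit condition.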