Replace only specific values in a DataFrame column based on specific values in another column

Time: 2020-09-24 12:46:51

Tags: python pandas if-statement replace

I have the following dataframe:

>>> name   ID     geom                                                geometry_error
0  Lily   1234  POLYGON ((5.351418786 7.471461148, 5.352018786...     overlap
1  Pil    3248  POLYGON ((7.351657486 9.341445548, 1.346718786...     overlap
2  Poli   9734  -                                                     -
0  Lily   1234  POLYGON ((5.351265486 2.471876538, 6.33355018786...   overlap

I want to "edit" the geometry_error column so that wherever the geom value is '-', the geometry_error value becomes "no geometry", for example:

>>> name   ID     geom                                                geometry_error
0  Lily   1234  POLYGON ((5.351418786 7.471461148, 5.352018786...     overlap
1  Pil    3248  POLYGON ((7.351657486 9.341445548, 1.346718786...     overlap
2  Poli   9734  -                                                     no geometry
0  Lily   1234  POLYGON ((5.351265486 2.471876538, 6.33355018786...   overlap

I tried this:

def gg(row):
    if row['geom'] == '-':
        val = 'no geometry generated'   
    return val

df['geometry errors'] = df.apply(gg, axis=1)

>>>UnboundLocalError: local variable 'val' referenced before assignment

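I don't understand why I get this error, because I use the variable name val in other functions in the same script. So why do I get this error now? And is there perhaps a better way to do this?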

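3 answers:

Answer 0 (score: 2)

Use this; it is nice and simple. np.where does the test for you: where the condition is true it takes the new value, otherwise it keeps the existing one.

Code: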

import numpy as np

# ...

df['geometry_error'] = np.where(df['geom'] == '-', 
                                'no geometry generated', 
                                df['geometry_error'])

Output:

   name    ID                                               geom  \
0  Lily  1234   POLYGON ((5.351418786 7.471461148, 5.352018786))   
1   Pil  3248   POLYGON ((7.351657486 9.341445548, 1.346718786))   
2  Poli  9734                                                  -   
3  Lily  1234  POLYGON ((5.351265486 2.471876538, 6.333550187...   

          geometry_error  
0                overlap  
1                overlap  
2  no geometry generated  
3                overlap
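
For readers who want to run the np.where approach end to end, here is a minimal sketch. The sample frame is an assumption made for illustration: the polygon strings are shortened placeholders, not the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the polygon strings are shortened placeholders
df = pd.DataFrame({
    "name": ["Lily", "Pil", "Poli"],
    "ID": [1234, 3248, 9734],
    "geom": [
        "POLYGON ((5.35 7.47, 5.35 7.48, 5.36 7.47))",
        "POLYGON ((7.35 9.34, 1.34 9.35, 1.35 9.34))",
        "-",
    ],
    "geometry_error": ["overlap", "overlap", "-"],
})

# Where geom is '-', use the new label; otherwise keep the existing value
df["geometry_error"] = np.where(
    df["geom"] == "-", "no geometry generated", df["geometry_error"]
)

print(df["geometry_error"].tolist())
```

Because the third argument is the column itself, rows that do not match the condition are left exactly as they were.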

Answer 1 (score: 0)

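# Chained indexing (df[mask]['col'] = ...) may assign to a temporary copy and
# leave df unchanged, so select the rows and the column in one .loc call.
df.loc[df['geom'] == '-', 'geometry_error'] = 'no geometry generated'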

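Answer 2 (score: 0)

A few ways:

  1. Replace all empty values in geometry_error with 'no geometry' (this only helps if the empty values are NaN, not the string '-')
df['geometry_error'] = df['geometry_error'].fillna('no geometry')
  2. Find all rows where geom == '-' and set their geometry_error to 'no geometry'
df.loc[df['geom'] == '-', 'geometry_error'] = 'no geometry'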
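
The difference between the two approaches can be sketched on a small frame. The data below is hypothetical: one row carries the '-' placeholder in geometry_error and another carries a real NaN, so the effect of each call is visible.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 holds the '-' placeholder, row 2 holds a real NaN
df = pd.DataFrame({
    "geom": ["POLYGON ((0 0, 1 0, 1 1, 0 0))", "-", "-"],
    "geometry_error": ["overlap", "-", np.nan],
})

# fillna only replaces the real missing value (row 2), not the '-' string
filled = df["geometry_error"].fillna("no geometry")

# .loc with a boolean mask updates every row whose geom is '-'
df.loc[df["geom"] == "-", "geometry_error"] = "no geometry"

print(filled.tolist())
print(df["geometry_error"].tolist())
```

For the data shown in the question, where missing geometry is marked with '-', the .loc form is the one that matches the requested condition.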

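I think your function isn't working because val is assigned only inside the if block, so for any row where geom is not '-' the return statement refers to a variable that was never set. Change the indentation of the return statement, and return the existing value for the other rows so they are not overwritten with None:

def gg(row):
    if row['geom'] == '-':
        val = 'no geometry generated'
        return val
    # Rows that do not match keep their current value
    return row['geometry_error']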
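
To see where the UnboundLocalError comes from, the sketch below calls a function shaped like the one in the question on a row whose geom is a polygon. Local variables belong to the function that assigns them, so using the name val in other functions has no effect here. The names gg_original and gg_with_fallback and the sample frame are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical two-row frame: one polygon row, one '-' row
df = pd.DataFrame({
    "geom": ["POLYGON ((0 0, 1 0, 1 1, 0 0))", "-"],
    "geometry_error": ["overlap", "-"],
})

def gg_original(row):
    # Same shape as the function in the question: val is set only in the if branch
    if row["geom"] == "-":
        val = "no geometry generated"
    return val

def gg_with_fallback(row):
    # Hypothetical variant: every code path returns a value
    if row["geom"] == "-":
        return "no geometry generated"
    return row["geometry_error"]

try:
    gg_original(df.iloc[0])  # geom is a polygon, so val was never assigned
    raised = False
except UnboundLocalError:
    raised = True

df["geometry_error"] = df.apply(gg_with_fallback, axis=1)

print(raised)
print(df["geometry_error"].tolist())
```

A row-wise apply works, but it calls the Python function once per row; on large frames the vectorized forms (np.where or .loc with a boolean mask) are usually faster.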