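I have one column containing postcodes and another containing neighbourhoods, and the postcode column has some null values. First, I found the neighbourhoods that correspond to each postcode. Second, I found the most common postcode within a neighbourhood.
Below are some of the postcodes for neighbourhood X. The mode for this particular neighbourhood is Y. What I want to do is take the rows that have neighbourhood X in the neighbourhood column and a null value in the postcode column, and fill that postcode with the mode.
This is the mode for neighbourhood X. It returns the actual mode (BS8) along with the full list of all postcodes for neighbourhood X: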
<bound method Series.mode of 25 BS8
1904 BS1
1919 BS8
2070 BS1
2083 BS1
2099 NaN
2105 BS1
2228 NaN
2256 BS1
2265 BS8
2285 BS8
2298 BS8
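Note that the output above starts with `<bound method Series.mode of`, which is what pandas prints when `mode` is referenced without parentheses: it shows the method object followed by the Series itself, rather than the computed mode. A minimal sketch of the difference, using a small made-up Series (the values are illustrative, not the real data):

```python
import numpy as np
import pandas as pd

# Hypothetical postcodes for one neighbourhood (illustrative values only)
postcodes = pd.Series(["BS8", "BS1", "BS8", np.nan, "BS8"])

# Without parentheses this is only a reference to the method, not a result
method_ref = postcodes.mode

# With parentheses, mode() returns a Series of the most frequent value(s);
# NaN is ignored by default, and ties produce more than one entry
modes = postcodes.mode()

# Take the first entry to get a single value to fill with
most_common = modes.iloc[0]
print(most_common)
```

Because `mode()` can return several values when there is a tie, taking `.iloc[0]` is one simple way to settle on a single postcode.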
So in this case, I want to fill the NaN values in the postcode column with the most common postcode for HH.
neighbourhood Postcode
WH BS9
SB BS9
HF BS9
WH BS9
WH BS9
SB BS9
HH nan
SGTH nan
If the most common postcode for HH is Z, I want Z filled into the corresponding postcode, like this:
neighbourhood Postcode
WH BS9
SB BS9
HF BS9
WH BS9
WH BS9
SB BS9
HH Z
SGTH nan
After looking online, I tried something like the code below, but it did not work.
airbnb.postcode = airbnb.apply(
lambda row: "BS8 " if (airbnb.neighbourhood=="HH" & airbnb.postcode== np.NaN) else row.postcode )
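One likely reason this attempt fails, whatever else is going on, is the comparison with NaN: `== np.nan` is never true, because NaN is not equal to anything, including itself, so missing values have to be detected with `isna()`. Two further issues are worth checking: `&` binds more tightly than `==`, so each comparison needs its own parentheses, and the lambda compares whole `airbnb` columns instead of the fields of `row`. A small sketch of the NaN behaviour, using a made-up Series:

```python
import numpy as np
import pandas as pd

# Hypothetical postcode values, one of them missing
postcode = pd.Series(["BS9", np.nan, "BS8"])

# Equality against NaN is False everywhere, even for the missing entry
eq_mask = postcode == np.nan

# isna() is True exactly where the value is missing
na_mask = postcode.isna()

print(eq_mask.tolist())
print(na_mask.tolist())
```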
Answer 0 (score: 2)
Use np.select.
Data:
# df2:
# neighbourhood Postcode
# 0 WH BS9
# 1 SB BS9
# 2 HF BS9
# 3 WH BS9
# 4 WH BS9
# 5 SB BS9
# 6 HH NaN
# 7 SGTH NaN
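import numpy as np

# Rows where the neighbourhood is HH and the postcode is missing
conditions = [
    (df2['neighbourhood'] == 'HH') & (df2['Postcode'].isna()),
]
# Value to assign for each condition, in the same order
choices = [
    'BS8'
]
# Rows that match no condition keep their existing postcode
df2['Postcode'] = np.select(conditions, choices, df2['Postcode'])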
Output:
neighbourhood Postcode
0 WH BS9
1 SB BS9
2 HF BS9
3 WH BS9
4 WH BS9
5 SB BS9
6 HH BS8
7 SGTH NaN
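The approach above hard-codes both the neighbourhood and the fill value. If the goal is to fill every missing postcode with the most common postcode of its own neighbourhood, one possible sketch is to compute the mode per neighbourhood and map it back. The sample frame below is hypothetical (HH is given two known postcodes so that a mode exists), and `modes` and `fill_values` are names chosen for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: HH has known postcodes, SGTH has none at all
df2 = pd.DataFrame({
    "neighbourhood": ["WH", "SB", "HF", "WH", "HH", "HH", "HH", "SGTH"],
    "Postcode": ["BS9", "BS9", "BS9", "BS9", "BS8", "BS8", np.nan, np.nan],
})

# Most common non-null postcode per neighbourhood; neighbourhoods with
# no known postcode are simply absent from this Series
modes = (
    df2.dropna(subset=["Postcode"])
       .groupby("neighbourhood")["Postcode"]
       .agg(lambda s: s.mode().iloc[0])
)

# Look up each row's neighbourhood mode (NaN where none is known),
# then use it only where the postcode is currently missing
fill_values = df2["neighbourhood"].map(modes)
df2["Postcode"] = df2["Postcode"].fillna(fill_values)

print(df2)
```

In this sketch the missing HH postcode becomes BS8, while SGTH stays NaN because there is no known postcode to take a mode from.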