I have the following DataFrame:
DATE STG #TIME HRD SZ AREA BEAU PSD EFFORT TYPE NORTHING EASTING SEASON BOAT ASSOC. P/S
0 2016-04-06 1 1025 12 W LANTAU 2 58 ON HKCRP 813713 802792 SPRING NONE S
1 2016-04-06 2 1113 3 W LANTAU 4 27 ON HKCRP 806173 802043 SPRING NONE S
2 2016-04-06 3 1345 2 SW LANTAU 2 ND OFF HKCRP 805606 803300 SPRING NONE NaN
When I run
#remove space in content
df_obj = sighting.select_dtypes(['object'])
df_obj
sighting[df_obj.columns] = df_obj.apply(lambda x: x.str.strip())
it wipes out the values in the PSD column and sets them to NaN. Why does this happen, and how can I fix it? Thanks!
Answer 0 (score: 2)
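I'd bet the object column PSD holds a mix of strings and integers. The .str methods return NaN for any element that is not a string: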
In [11]: df_obj.PSD.values
Out[11]: array([58, 27, 'ND'], dtype=object)
In [12]: df_obj.apply(lambda x: x.str.strip())
Out[12]:
DATE SZ AREA PSD EFFORT TYPE SEASON BOAT ASSOC.P/S
0 2016-04-06 W LANTAU NaN ON HKCRP SPRING NONE S
1 2016-04-06 W LANTAU NaN ON HKCRP SPRING NONE S
2 2016-04-06 SW LANTAU ND OFF HKCRP SPRING NONE NaN
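To confirm the guess rather than bet on it, you can count the Python types stored in the column. This is a minimal sketch on a stand-in Series built from the three PSD values shown above; the names `psd`, `type_counts` and `stripped` are only illustrative.

```python
import pandas as pd

# Stand-in for the PSD column: two integers and one string, stored as object.
psd = pd.Series([58, 27, "ND"], dtype=object, name="PSD")

# Count how many values of each Python type the column holds.
type_counts = psd.map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# .str.strip() only handles the string elements; the integers come back as NaN.
stripped = psd.str.strip()
print(stripped)
```

If more than one type shows up in the counts, the column is mixed and the `.str` accessor will blank out the non-string entries.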
You can fix this by coercing all of the object columns to strings:
In [13]: df_obj.astype("str").apply(lambda x: x.str.strip())
Out[13]:
DATE SZ AREA PSD EFFORT TYPE SEASON BOAT ASSOC.P/S
0 2016-04-06 W LANTAU 58 ON HKCRP SPRING NONE S
1 2016-04-06 W LANTAU 27 ON HKCRP SPRING NONE S
2 2016-04-06 SW LANTAU ND OFF HKCRP SPRING NONE nan
Note: this isn't perfect, because NaN has been converted to the string 'nan'.
You can work around that, though I suspect there is a better way:
In [21]: df_obj.where(df_obj.isnull(), df_obj.astype("str")).apply(lambda x: x.str.strip())
Out[21]:
DATE SZ AREA PSD EFFORT TYPE SEASON BOAT ASSOC.P/S
0 2016-04-06 W LANTAU 58 ON HKCRP SPRING NONE S
1 2016-04-06 W LANTAU 27 ON HKCRP SPRING NONE S
2 2016-04-06 SW LANTAU ND OFF HKCRP SPRING NONE NaN
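One possible alternative, if you would rather keep the integers as integers instead of turning them into strings, is to strip only the values that really are strings. This is a sketch on a small made-up frame, not the original data; `strip_if_str` is a hypothetical helper, and the padded values are assumptions added so the effect is visible.

```python
import numpy as np
import pandas as pd

# Small made-up frame with the same kind of mixed object column as PSD.
sighting = pd.DataFrame(
    {
        "AREA": [" W LANTAU", "W LANTAU ", "SW LANTAU"],
        "PSD": [58, 27, " ND "],
        "P/S": ["S ", "S", np.nan],
    },
    dtype=object,
)

def strip_if_str(value):
    """Strip whitespace from strings; return anything else unchanged."""
    return value.strip() if isinstance(value, str) else value

# Apply the helper element by element to every object column.
obj_cols = sighting.select_dtypes(["object"]).columns
sighting[obj_cols] = sighting[obj_cols].apply(lambda col: col.map(strip_if_str))
print(sighting)
```

Here the integers in PSD and the missing value in P/S pass through untouched, so there is no 'nan' string to clean up afterwards. The trade-off is that the column stays mixed-type.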