Question

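I have a pandas DataFrame with an integer column that contains some NaNs. I want to convert the values from integers to strings and replace the NaNs with 'not available'.

The main reason is that I need to run groupbys on that column, and unless I convert the NaNs, groupby will drop them! Why that happens, and how the whole pandas community hasn't risen up over it, is an entirely separate discussion (when I first learned about it, I couldn't believe it...).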

I tried converting the column with astype and then calling fillna('not available'), but it doesn't work. Note that I tried both astype('str') and astype(str). In both cases the column is converted to object, not string; maybe because Python assumes (wrongly, since they all have the same length in my DataFrame) that the length of the strings will vary? But, most importantly, fillna() doesn't work, and the NaNs remain NaNs! Why? Any suggestions? Thanks!

Answer 1

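Once you convert the values to str, fillna will not work, because the column no longer contains np.nan but the string value 'nan'. Call fillna on the original column instead: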

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(1, 10, (10000, 5)), columns=['a', 'b', 'c', 'd', 'e'])
df.iloc[0, 0] = np.nan
# df['a'] = df['a'].astype(str)  <-- You don't need this line.
df['a'] = df['a'].fillna('not available')
print(df.dtypes)
print(df.head())

Output:

a    object
b     int32
c     int32
d     int32
e     int32
dtype: object
               a  b  c  d  e
0  not available  6  3  9  7
1              5  4  5  5  3
2              4  2  5  3  2
3              4  9  2  8  3
4              2  6  5  9  1
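
If every value in the column should end up as a string (not only the filled one), one possible variation is to go through pandas' nullable integer dtype first. This is a minimal sketch, not part of the original answer: the three-row frame is made up for illustration, and it assumes a pandas version that provides the "Int64" and "string" extension dtypes and that the non-missing values are whole numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the NaN makes column 'a' float64.
df = pd.DataFrame({"a": [np.nan, 5, 4], "b": [6, 4, 2]})

# Nullable Int64 keeps 5 as 5 (not 5.0) and keeps the missing value missing.
a_int = df["a"].astype("Int64")

# Converting to the "string" dtype leaves the missing value missing,
# so fillna still has something to fill.
df["a"] = a_int.astype("string").fillna("not available")

print(df["a"].tolist())
```

The column ends up with the "string" extension dtype rather than object; append .astype(object) if a plain object column is preferred.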

Answer 2

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(1, 10, (10, 5)), columns=['a', 'b', 'c', 'd', 'e'])
df.iloc[0, 0] = np.nan

df.isnull()
Out[329]: 
       a      b      c      d      e
0   True  False  False  False  False
1  False  False  False  False  False
2  False  False  False  False  False
3  False  False  False  False  False
4  False  False  False  False  False
5  False  False  False  False  False
6  False  False  False  False  False
7  False  False  False  False  False
8  False  False  False  False  False
9  False  False  False  False  False

After changing to str:

df['a']=df['a'].astype(str)

df.isnull()
Out[332]: 
       a      b      c      d      e
0  False  False  False  False  False
1  False  False  False  False  False
2  False  False  False  False  False
3  False  False  False  False  False
4  False  False  False  False  False
5  False  False  False  False  False
6  False  False  False  False  False
7  False  False  False  False  False
8  False  False  False  False  False
9  False  False  False  False  False

You have changed the null value np.nan into the string 'nan':

df.iloc[0,0]
Out[334]: 'nan'
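
One way to sidestep this is to record which rows are missing before converting, and to use that mask afterwards. A minimal sketch on a made-up three-value Series (the names s, was_missing and as_text are only for illustration, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the NaN makes the Series float64.
s = pd.Series([np.nan, 5.0, 4.0])

# Remember where the real missing values are before touching the dtype.
was_missing = s.isna()

# Convert everything to text, then overwrite the originally missing rows.
as_text = s.astype(str)
result = as_text.where(~was_missing, "not available")

print(result.tolist())
```

Because the NaN forces the column to float, the remaining values read '5.0' rather than '5' after the conversion.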

How to convert a DataFrame column to strings and replace NaNs (fillna not working)

2 answers: