Question

I have some columns, ['subject', 'H.period', 'DD.period.t'] and so on. In fact, all of the columns are of object dtype.

How can I convert these columns to string type?

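And how can I use .replace to change "," to "." in the data from the CSV file? I need to use this data with the K-nearest neighbors machine learning algorithm.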

Answer 1

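pandas has traditionally had no dedicated string dtype: columns that hold strings are stored with the object dtype (newer versions also offer an opt-in "string" extension dtype). As stated in the docs: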

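Note: When working with heterogeneous data, the dtype of the resulting ndarray will be chosen to accommodate all of the data involved. For example, if strings are involved, the result will be of object dtype. If there are only floats and integers, the resulting array will be of float dtype.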

To replace , with . throughout the whole DataFrame, use replace with regex=True:

df = df.replace(',', '.', regex=True)
# or, modifying df in place
df.replace(',', '.', regex=True, inplace=True)
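
The regex=True part matters because, without it, DataFrame.replace only swaps cells whose entire value equals ','. The sketch below shows the difference on a made-up one-column frame (the data and the names whole_cell and substring are illustrative, not from the question):

```python
import pandas as pd

# Illustrative data only; dtype=object mirrors the question's columns.
df = pd.DataFrame({"col1": ["x,x", ","]}, dtype=object)

# Without regex=True, only cells that are exactly "," are replaced.
whole_cell = df.replace(",", ".")

# With regex=True, every comma inside each string is replaced.
substring = df.replace(",", ".", regex=True)

print(whole_cell["col1"].tolist())
print(substring["col1"].tolist())
```

Note that with regex=True the first argument is treated as a regular expression, so a pattern such as '.' would need escaping ('\.') to match a literal dot.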

For example, if your DataFrame df looks like this:

>>> df
  col1         col2
0  x,x    blah,blah
1  y,z  hello,world
2  z.z       ,.,.,.

then:

>>> df = df.replace(',', '.', regex=True)
>>> df
  col1         col2
0  x.x    blah.blah
1  y.z  hello.world
2  z.z       ......
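
Since the data is meant for a K-nearest neighbors model, the columns will probably need to be numeric rather than strings containing dots. Here is a minimal sketch, assuming the CSV is semicolon-separated and uses a comma as the decimal mark (the sample content and the names raw, df_direct, df_text and num_cols are made up for illustration):

```python
import io

import pandas as pd

# Hypothetical CSV content; only the column names come from the question.
raw = "subject;H.period;DD.period.t\ns1;1,5;10,25\ns2;2,0;11,75\n"

# Option 1: let read_csv interpret the decimal comma while parsing.
df_direct = pd.read_csv(io.StringIO(raw), sep=";", decimal=",")

# Option 2: read everything as text, replace the comma, then convert.
df_text = pd.read_csv(io.StringIO(raw), sep=";", dtype=str)
num_cols = ["H.period", "DD.period.t"]
for col in num_cols:
    # regex=False treats "," as a literal character.
    df_text[col] = df_text[col].str.replace(",", ".", regex=False).astype(float)

print(df_direct.dtypes)
print(df_text.dtypes)
```

If the file can be re-read, option 1 avoids the replace step entirely. Option 2 is closer to the approach above and works on a frame that is already loaded.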

Answer 2

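Although the dtype is indeed "object", if you apply the type() function to the individual values in those columns, you will find that they really are of class "str". So that part is already fine.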
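
A quick way to check this is sketched below, with made-up sample values (only the column names are taken from the question, and value_types and as_str are illustrative names):

```python
import pandas as pd

# Illustrative values; dtype=object mirrors what the question reports.
df = pd.DataFrame(
    {"subject": ["a", "b"], "H.period": ["1,5", "2,25"]},
    dtype=object,
)

# Every column reports the object dtype ...
print(df.dtypes)

# ... while the individual values are ordinary Python str objects.
value_types = {col: type(df[col].iloc[0]) for col in df.columns}
print(value_types)

# astype(str) forces every value to str if a column holds mixed types.
as_str = df["H.period"].astype(str)
print(as_str.tolist())
```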

As for your question about the replacement, I would suggest something like this:

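# Assumes every column holds strings, as in the question.
for column in df.columns:
    # Assign the whole column at once; the chained form
    # df[column][index] = ... can silently fail to update df.
    df[column] = df[column].str.replace(",", ".", regex=False)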
