Question

I have the following DataFrame:

import pandas as pd

df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7],
    'col_3': [14, 15, 16, 19]
})

I am trying to convert the numbers to strings and then join each row into a single string.

I can achieve this with:

df.apply(lambda x: ''.join(x.astype(str)), axis=1)

Out[209]: 
0    0414
1    1515
2    2616
3    3719
dtype: object  # notice that the dtype here is object

Here is the problem.

Then I tried using sum:

df.astype(str).sum(1)
Out[211]: 
0     414.0
1    1515.0
2    2616.0
3    3719.0
dtype: float64

Note that the dtype becomes float64 instead of object, and the leading zero of the first row is lost.

Here is more information:

df.astype(str).applymap(type)
Out[221]: 
           col_1          col_2          col_3
0  <class 'str'>  <class 'str'>  <class 'str'>
1  <class 'str'>  <class 'str'>  <class 'str'>
2  <class 'str'>  <class 'str'>  <class 'str'>
3  <class 'str'>  <class 'str'>  <class 'str'>

Why does sum have this weird behavior? Is there a way to stop it from converting the str values back to float?

Thanks for your help :-)

Answer 1

If you want to use sum, you can try it this way:

df.astype(str).apply(lambda x: x.sum(),1)

Output:

0    0414
1    1515
2    2616
3    3719
dtype: object
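
As a related sketch, the same row-wise join can be written without sum at all, by casting to str and joining each row with `''.join`. This is only an illustration using the DataFrame from the question; the variable name `joined` is made up here, and the exact dtype of the result may depend on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7],
    'col_3': [14, 15, 16, 19]
})

# Cast every cell to str, then join the cells of each row (axis=1).
# No numeric reduction is involved, so the values stay strings.
joined = df.astype(str).agg(''.join, axis=1)

print(joined.tolist())  # ['0414', '1515', '2616', '3719']
```

Because the join happens on Python strings, the leading zero in the first row is kept.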

Answer 2

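sum does not behave as expected here because, when the result is returned as a Series, every concatenated value consists only of digits, so the result is converted to the corresponding float dtype. The result stays object only when the values are of mixed type, i.e. when at least one of them is not numeric.

For example, when you run: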

df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7],
    'col_3': [14, 15, 16, 'b']
})

df.astype(str).sum(1)

Output:

0    0414
1    1515
2    2616
3     37b
dtype: object
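
The all-or-nothing conversion described above can be mimicked in plain Python. This is only an analogy, not pandas' internal code, and `coerce_if_all_numeric` is a hypothetical helper written for this illustration: if every string parses as a number the whole result becomes float, otherwise the strings are left untouched.

```python
def coerce_if_all_numeric(values):
    """Return floats if every value parses as a number, else the original strings."""
    try:
        return [float(v) for v in values]
    except ValueError:
        # At least one value (e.g. '37b') is not numeric: keep everything as str.
        return list(values)

all_digits = coerce_if_all_numeric(['0414', '1515', '2616', '3719'])
mixed = coerce_if_all_numeric(['0414', '1515', '2616', '37b'])

print(all_digits)  # [414.0, 1515.0, 2616.0, 3719.0]
print(mixed)       # ['0414', '1515', '2616', '37b']
```

Note that `float('0414')` is `414.0`, which matches the lost leading zero in the question's output.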

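An alternative to sum is to use cumsum and take the last column, so that the dtype is preserved, i.e. (using the original df from the question):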

s = df.astype(str).cumsum(1).iloc[:,-1]

Output:

0    0414
1    1515
2    2616
3    3719
Name: col_3, dtype: object
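
Another possible sketch, in the same spirit as cumsum but without building the intermediate cumulative columns, is to add the string columns together one by one. The names `string_columns` and `combined` are made up for this example, and it assumes the original DataFrame from the question.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({
    'col_1': [0, 1, 2, 3],
    'col_2': [4, 5, 6, 7],
    'col_3': [14, 15, 16, 19]
})

# One string Series per column, in column order.
string_columns = [df[name].astype(str) for name in df.columns]

# Adding two string Series concatenates them element by element,
# so folding with operator.add joins each row from left to right.
combined = reduce(operator.add, string_columns)

print(combined.tolist())  # ['0414', '1515', '2616', '3719']
```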

Hope that helps.
