Question

shift converts my column from integer to float. It turns out that np.nan is float only. Is there any way to keep the shifted column as integer?

import pandas as pd

df = pd.DataFrame({"a":range(5)})
df['b'] = df['a'].shift(1)

df['a']
# 0    0
# 1    1
# 2    2
# 3    3
# 4    4
# Name: a, dtype: int64

df['b']

# 0   NaN
# 1     0
# 2     1
# 3     2
# 4     3
# Name: b, dtype: float64

Answer 1

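The problem is that the NaN introduced by shift is a float, so the int column is promoted to float; see "NA type promotions" in the pandas documentation.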
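To make the promotion concrete, here is a minimal sketch (the variable names `s` and `shifted` are just for illustration) that checks the dtype before and after the shift:

```python
import numpy as np
import pandas as pd

s = pd.Series(range(5))
shifted = s.shift(1)

print(s.dtype)        # int64
print(shifted.dtype)  # float64, because the new first element is NaN
print(type(np.nan))   # <class 'float'>
```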

One possible solution is to replace the NaN values with some value, e.g. 0, after which the column can be cast back to int:

df = pd.DataFrame({"a":range(5)})
df['b'] = df['a'].shift(1).fillna(0).astype(int)
print(df)
   a  b
0  0  0
1  1  0
2  2  1
3  3  2
4  4  3

Answer 2

You can build a numpy array by prepending a 0 to all but the last element of column a

df.assign(b=np.append(0, df.a.values[:-1]))

   a  b
0  0  0
1  1  0
2  2  1
3  3  2
4  4  3
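If you need to shift by more than one row, the same idea can be generalized. This is only a sketch: `shift_keep_dtype` is a hypothetical helper name, not a pandas or numpy function, and it assumes a non-negative `periods` and a fill value that fits the array's dtype.

```python
import numpy as np
import pandas as pd

def shift_keep_dtype(values, periods=1, fill=0):
    """Shift a 1-D array forward by `periods`, padding the front with `fill`.

    Hypothetical helper for illustration; only non-negative shifts are handled.
    """
    values = np.asarray(values)
    if periods < 0:
        raise ValueError("periods must be non-negative")
    periods = min(periods, len(values))
    # The padding uses the input dtype, so no float promotion takes place.
    pad = np.full(periods, fill, dtype=values.dtype)
    return np.concatenate([pad, values[:len(values) - periods]])

df = pd.DataFrame({"a": range(5)})
df["b"] = shift_keep_dtype(df["a"].to_numpy(), periods=1)
df["c"] = shift_keep_dtype(df["a"].to_numpy(), periods=2)
print(df.dtypes)
```

As with fillna(0), the padded zeros cannot be told apart from real zeros in the data.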

Answer 3

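As of pandas 1.0.0, I believe you have another option, which is to first use convert_dtypes. This converts the dataframe columns to dtypes that support pd.NA, which avoids the NaN issue.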

df = pd.DataFrame({"a":range(5)})
df = df.convert_dtypes()
df['b'] = df['a'].shift(1)

print(df['a'])
# 0    0
# 1    1
# 2    2
# 3    3
# 4    4
# Name: a, dtype: Int64

print(df['b'])
# 0    <NA>
# 1       0
# 2       1
# 3       2
# 4       3
# Name: b, dtype: Int64
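If you only care about one column, I think you can also request the nullable "Int64" dtype (capital I) directly rather than converting the whole frame. A small sketch, with `s` and `shifted` as illustrative names:

```python
import pandas as pd

# "Int64" is the nullable integer extension dtype, distinct from numpy's "int64".
s = pd.Series(range(5), dtype="Int64")
shifted = s.shift(1)

print(shifted.dtype)           # Int64
print(shifted.isna().tolist())
```

The missing slot is pd.NA rather than NaN, so the remaining values stay integers.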

Answer 4

Another solution, available starting with pandas version 0.24.0: simply provide a value for the fill_value parameter:

df['b'] = df['a'].shift(1, fill_value=0)
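For completeness, a self-contained version of the line above with a dtype check, assuming the same example frame as in the question:

```python
import pandas as pd

df = pd.DataFrame({"a": range(5)})
# An integer fill_value means no NaN is introduced, so the dtype stays int64.
df["b"] = df["a"].shift(1, fill_value=0)

print(df["b"].dtype)  # int64
```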

Answer 5

Another solution is to use the replace() function followed by a type cast

df['b'] = df['a'].shift(1).replace(np.nan, 0).astype(int)
