Question

I want to create a simple vector containing many repeated values. This is easy in R:

> numbers <- c(rep(1,5), rep(2,4), rep(3,3))
> numbers
[1] 1 1 1 1 1 2 2 2 2 3 3 3

However, if I try to do this in Python using pandas and numpy, I don't get quite the same thing:

import numpy as np
import pandas as pd

numbers = pd.Series([np.repeat(1,5), np.repeat(2,4), np.repeat(3,3)])
numbers
0    [1, 1, 1, 1, 1]
1       [2, 2, 2, 2]
2          [3, 3, 3]
dtype: object

What is the Python equivalent of the R code?

Answer 1

Just adjust how you use np.repeat: pass the values and the repeat counts as two lists.

np.repeat([1, 2, 3], [5, 4, 3])

array([1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3])

Or as a pd.Series:

pd.Series(np.repeat([1, 2, 3], [5, 4, 3]))

0     1
1     1
2     1
3     1
4     1
5     2
6     2
7     2
8     2
9     3
10    3
11    3
dtype: int64

That said, the most literal way to replicate what you did in R is to combine np.concatenate with np.repeat. It is not what I would recommend doing.

np.concatenate([np.repeat(1,5), np.repeat(2,4), np.repeat(3,3)])

array([1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3])
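
For anyone who wants to check this themselves, here is a minimal sketch that puts the two approaches side by side and also shows why the attempt in the question ended up with three rows. The names values, counts, repeated, joined, nested and flat are just illustrative choices, not anything from the original post.

```python
import numpy as np
import pandas as pd

values = [1, 2, 3]
counts = [5, 4, 3]

# Vectorized form: each value is repeated by the count at the same position.
repeated = np.repeat(values, counts)

# R-style form: build each run separately, then join the runs end to end.
joined = np.concatenate([np.repeat(v, n) for v, n in zip(values, counts)])

# Both forms produce the same flat array.
print(np.array_equal(repeated, joined))

# A list of arrays handed straight to pd.Series keeps each array as a single
# element (dtype object), which is what happened in the question.
nested = pd.Series([np.repeat(v, n) for v, n in zip(values, counts)])

# A flat array gives one row per number instead.
flat = pd.Series(repeated)

print(len(nested), nested.dtype)
print(len(flat))
```

The point is that pd.Series does not flatten its input, so the flattening has to happen first, either through the counts argument of np.repeat or through np.concatenate.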

Answer 2

You can now use the same syntax in Python:

>>> from datar.base import c, rep
>>>
>>> numbers = c(rep(1,5), rep(2,4), rep(3,3))
>>> print(numbers)
[1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3]

I am the author of the datar package. If you have any questions, feel free to submit an issue.

Pandas equivalent of R's series of multiple repeated numbers
