Question

Someone else gave me a dataset that looks like this.

input = c(0,0,0,0,1,0,0,0,0,0,0,2,0,1,0,0,2)

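What I need to do is fill each 0 with the next non-zero number (which will always be 1 or 2). In addition, each non-zero number itself needs to be replaced by the next non-zero number after it (the last value can be anything, since I intend to set it to NA).

So the function should return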

> output =c(1,1,1,1,2,2,2,2,2,2,2,1,1,2,2,2,NA)
> cbind(input,output)
      input output
 [1,]     0      1
 [2,]     0      1
 [3,]     0      1
 [4,]     0      1
 [5,]     1      2
 [6,]     0      2
 [7,]     0      2
 [8,]     0      2
 [9,]     0      2
[10,]     0      2
[11,]     0      2
[12,]     2      1
[13,]     0      1
[14,]     1      2
[15,]     0      2
[16,]     0      2
[17,]     2     NA

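Thanks!

- Edit -

The output only needs to be an array/vector (or whatever the correct term is in R). In the example I bound the two together with cbind to demonstrate the offset of 1 as well as the fill. Thanks for your answers.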

Answer 1

Set the 0 values to NA and use na.locf:

input[input==0] <- NA
na.locf(input, fromLast=TRUE)
## [1] 1 1 1 1 1 2 2 2 2 2 2 2 1 1 2 2 2

From the documentation of na.locf (zoo package):

Generic function for replacing each NA with the most recent non-NA prior to it.
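Note that the backward fill on its own does not apply the one-position offset the question asks for (element 5 comes out as 1 rather than 2). A minimal sketch of one way to get the offset, assuming the zoo package is installed: shift the vector left by one position before filling. The names shifted and output are illustrative only, and this is a variation on the answer rather than the answerer's own code.

```r
library(zoo)  # assumption: the zoo package is installed

input <- c(0,0,0,0,1,0,0,0,0,0,0,2,0,1,0,0,2)

# Drop the first element and pad the end with NA, so that each position
# now "sees" the value that originally came right after it.
shifted <- c(input[-1], NA)

# Mark zeros as missing; which() skips the padded NA safely.
shifted[which(shifted == 0)] <- NA

# Fill backward from the next non-NA value; na.rm = FALSE keeps the
# trailing NA instead of dropping it.
output <- na.locf(shifted, fromLast = TRUE, na.rm = FALSE)
output
```

With the sample input this should reproduce the requested vector, ending in NA.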

Answer 2

output=input[!!input][cumsum(!!input)+1]
#[1]  1  1  1  1  2  2  2  2  2  2  2  1  1  2  2  2 NA

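We take advantage of how R coerces numbers to logicals. as.logical(0:2) returns FALSE TRUE TRUE: zero becomes FALSE and every other number is treated as TRUE. Putting the double negation !! in front of input coerces it to logical. I could have written as.logical(input); this is just a trick to save a few keystrokes. So we use that logical index to subset the non-zero values with input[!!input]. The cumulative sum of the logical index, cumsum(!!input), increases by one at every non-zero element (the change points), so adding one to it gives, for each position, the index of the next non-zero value. It helps to run each part separately.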
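Running the pieces one at a time on the question's input might look like the following sketch. The intermediate names (is_nonzero, nonzero_values, next_index) are introduced here for illustration and are not part of the original one-liner.

```r
input <- c(0,0,0,0,1,0,0,0,0,0,0,2,0,1,0,0,2)

# Step 1: coerce to logical; TRUE marks the non-zero positions.
is_nonzero <- !!input

# Step 2: keep only the non-zero values, in order.
nonzero_values <- input[is_nonzero]
nonzero_values
#> [1] 1 2 1 2

# Step 3: the running count of non-zeros seen so far, plus one, is the
# index (within nonzero_values) of the next non-zero value.
next_index <- cumsum(is_nonzero) + 1
next_index
#> [1] 1 1 1 1 2 2 2 2 2 2 2 3 3 4 4 4 5

# Step 4: index 5 is past the end of nonzero_values, so R returns NA there.
output <- nonzero_values[next_index]
output
#> [1]  1  1  1  1  2  2  2  2  2  2  2  1  1  2  2  2 NA
```

The trailing NA falls out for free because indexing a vector past its length yields NA in R.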

Answer 3

An alternative:

#find non-0s
x<-which(input!=0)
#replace 0s with subsequent nonzeros;
#  input[x] is the elements to replace
#  diff(c(0,x))-1 is the number of times to repeat
#    we need to pad x with a 0 to get initial 0s,
#    and we need to subtract 1 to ignore the nonzeros themselves
input[input==0]<-rep(input[x],diff(c(0,x))-1)
#replace nonzeros with the subsequent nonzeros
#  x[-1] removes the first element
#  we need to pad with NA as the final element
input[x]<-c(input[x[-1]],NA)

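Looking at it again, it may be a bit cryptic; let me know if you'd like me to elaborate.

Edit:

The above works perfectly well for your input, but fails if there are any trailing 0s. If your vector v has trailing 0s, the following is a bit more involved but works: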

#as above
x<-which(v!=0)
#now, we need to integrate more NA replacement to this step
#  since we allow for 0s to finish v, also pad x with an end-of-vector marker
v[v==0]<-rep(c(v[x],NA),diff(c(0,x,length(v)+1))-1)
v[x]<-c(v[x[-1]],NA)
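To try the trailing-zero version out, it can be wrapped in a function, as in this sketch. The name fill_next_nonzero and the test vector with_trailing are hypothetical, chosen here for illustration; the body is the two assignments from the snippet above.

```r
# Hypothetical wrapper around the trailing-zero-safe version.
fill_next_nonzero <- function(v) {
  x <- which(v != 0)  # positions of the non-zero values
  # Fill each run of zeros with the non-zero that follows it; the extra NA
  # and the end-of-vector marker cover zeros after the last non-zero.
  v[v == 0] <- rep(c(v[x], NA), diff(c(0, x, length(v) + 1)) - 1)
  # Replace each non-zero with the next non-zero; the last one becomes NA.
  v[x] <- c(v[x[-1]], NA)
  v
}

input <- c(0,0,0,0,1,0,0,0,0,0,0,2,0,1,0,0,2)
fill_next_nonzero(input)
#> [1]  1  1  1  1  2  2  2  2  2  2  2  1  1  2  2  2 NA

with_trailing <- c(0, 1, 0, 2, 0, 0)
fill_next_nonzero(with_trailing)
#> [1]  1  2  2 NA NA NA
```

Because R functions work on a copy of their argument, the wrapper leaves the caller's vector unchanged, unlike the in-place snippets.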

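Even so, it is considerably faster (almost 4x) than @MatthewLundberg's suggestion, at the cost of readability. However, @PierreLafortune's answer reigns supreme. I'm still not sure how it works...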

test<-sample(0:2,size=1e6,rep=T,prob=table2(input,prop=T))
microbenchmark(times=100L,matt(test),mike(test),plaf(test))

Unit: milliseconds
       expr       min        lq      mean    median        uq       max neval
 matt(test) 130.23987 130.70963 135.02442 131.23443 131.93079 183.73196   100
 mike(test)  36.27610  36.48493  36.66895  36.58639  36.73403  38.12300   100
 plaf(test)  22.18888  22.30271  23.08294  22.43586  22.65402  76.95877   100
