How can I replace all zeros with the last non-zero value per ID in R?
Example:
Input:
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0))
Output:
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0),
res = c(0,10,30,30,30,50,80,0,0,57,57))
Is there a simple way to do this with the lag function?
Answer 0: (score: 1)
Here is a tidyverse approach:
library(tidyverse)
df %>%
  group_by(ID) %>%
  mutate(x = replace(Var1, cumsum(Var1 != 0) > 0 & Var1 == 0, NA)) %>%
  fill(x)
# # A tibble: 11 x 4
# # Groups: ID [2]
# ID Var1 res x
# <dbl> <dbl> <dbl> <dbl>
# 1 1. 0. 0. 0.
# 2 1. 10. 10. 10.
# 3 1. 30. 30. 30.
# 4 1. 0. 30. 30.
# 5 1. 0. 30. 30.
# 6 1. 50. 50. 50.
# 7 1. 80. 80. 80.
# 8 2. 0. 0. 0.
# 9 2. 0. 0. 0.
# 10 2. 57. 57. 57.
# 11 2. 0. 57. 57.
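In the mutate step, we replace each 0 with NA, except for the zeros at the start of each ID's run, because in those cases there is no earlier value to fill the NA with afterwards. fill() then carries the last non-NA value forward within each ID.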
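To make the condition inside replace() easier to follow, here is a minimal base R sketch of what it evaluates to for a single ID. The names v, seen_nonzero, and to_fill are introduced here for illustration only and are not part of the answer's code.

```r
# Values of Var1 for ID == 1, taken from the question's example
v <- c(0, 10, 30, 0, 0, 50, 80)

# FALSE until the first non-zero value has been seen, TRUE afterwards
seen_nonzero <- cumsum(v != 0) > 0

# Zeros that have at least one non-zero value before them
to_fill <- seen_nonzero & v == 0

# Only those zeros become NA; the leading zero is left alone
x <- replace(v, to_fill, NA)
print(x)
# values: 0 10 30 NA NA 50 80
```

Only positions 4 and 5 are turned into NA, so a forward fill has a value (30) to carry into them, while the leading zero stays 0.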
If you want to adjust several columns, you can use:
df %>%
  group_by(ID) %>%
  # across() replaces the deprecated mutate_at()/funs() pair (dplyr >= 1.0.0)
  mutate(across(starts_with("Var"),
                ~ replace(.x, cumsum(.x != 0) > 0 & .x == 0, NA))) %>%
  fill(starts_with("Var"))
where df could be:
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0),
Var2 = c(4,0, 30, 0, 0,50,0,16, 0, 57, 0))
Answer 1: (score: 1)
Without any packages, using only loops:
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
Var1 = c(0,10, 30, 0, 0,50,80,0, 0, 57, 0))
df$res <- df$Var1                 # start from a copy of Var1
last <- 0                         # last non-zero value seen within the current ID
for (i in seq_len(nrow(df))) {
  if (i == 1 || df$ID[i] != df$ID[i - 1]) {
    last <- 0                     # new ID: do not carry a value over from the previous ID
  }
  if (df$Var1[i] != 0) {
    last <- df$Var1[i]            # remember the current non-zero value
  } else {
    df$res[i] <- last             # zero: use the last non-zero value (stays 0 if none yet)
  }
}
Output:
> df
ID Var1 res
1 1 0 0
2 1 10 10
3 1 30 30
4 1 0 30
5 1 0 30
6 1 50 50
7 1 80 80
8 2 0 0
9 2 0 0
10 2 57 57
11 2 0 57
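For comparison, the same per-ID carry-forward can be sketched in base R without an explicit loop. This is an illustrative alternative rather than part of either answer; the helper name carry_last_nonzero and the column name res_vec are hypothetical.

```r
df <- data.frame(ID = c(1,1,1,1,1,1,1,2,2,2,2),
                 Var1 = c(0,10,30,0,0,50,80,0,0,57,0))

# Hypothetical helper: carry the last non-zero value forward within one vector
carry_last_nonzero <- function(v) {
  # Position of the most recent non-zero value so far (0 if none seen yet)
  idx <- cummax(seq_along(v) * (v != 0))
  # Prepend a 0 so that idx == 0 maps to 0 for leading zeros
  c(0, v)[idx + 1]
}

# ave() applies the helper separately within each ID
df$res_vec <- ave(df$Var1, df$ID, FUN = carry_last_nonzero)
print(df)
```

Because the helper only ever looks at one ID's values at a time, nothing is carried across ID boundaries, and runs of any number of consecutive zeros are handled the same way.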