Question

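I'm trying to loop over a column in a data.frame and replace each invalid value (9) with the next valid value, i.e. one not equal to 9, within each unique ID.

I've had no luck with dplyr or lapply, and I've searched for similar questions to no avail.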

#dummy data set
id<-c(1,1,1,1,2,2,2,2)
ind<-c(9,9,9,1,9,9,9,4)
df<-data.frame(id,ind)

#unique doesn't get me what I want
#If I do (i in 1:4) it will work for the first df$id but obviously not the 2nd.
for (i in unique(length(df$id)))
  {
    j=df$ind!=9
    df$ind[i]<-df$ind[j]
  }

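unique(length(df$id)) doesn't work, so I basically can't apply the loop to only the subset of rows for each df$id value. I think this would work if I could get past that problem. Non-loop solutions would also be appreciated.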
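
For reference, here is a small sketch of what the loop header above actually iterates over, using the same id vector as the dummy data. The variable names n_rows, loop_values, and group_ids are made up for illustration.

```r
id <- c(1, 1, 1, 1, 2, 2, 2, 2)

# length() returns a single number, so unique() has nothing to deduplicate
n_rows <- length(id)
loop_values <- unique(n_rows)   # one value, so the loop body runs only once

# unique() applied to the column itself gives one entry per group
group_ids <- unique(id)

print(loop_values)
print(group_ids)
```

Looping over group_ids rather than loop_values is what lets each pass work on the rows of a single id.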

Answer 1

If you want to use unique(), you can do it like this. There is probably a tidier way, but this works in base R:

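# Process each id separately, then stack the pieces back together
pieces <- lapply(unique(df$id), function(x){
  temp <- df[df$id == x,]
  # which.max() on a logical vector returns the position of the first TRUE,
  # i.e. the first value in this group that is not 9
  temp[temp$ind == 9, 'ind'] <- temp[which.max(temp$ind != 9), 'ind']
  temp
})
df <- do.call(rbind, pieces)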
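
The code above fills every 9 in a group with the first non-9 value of that group, which is enough for the dummy data. If a group can contain several valid values, a backward pass that carries the most recent valid value may be closer to "next valid value". The following is only a sketch under assumptions: fill_next_valid is a hypothetical helper name, the column is assumed to contain no NA values, and a trailing 9 with no valid value after it is left unchanged.

```r
# Dummy data from the question
id  <- c(1, 1, 1, 1, 2, 2, 2, 2)
ind <- c(9, 9, 9, 1, 9, 9, 9, 4)
df  <- data.frame(id, ind)

# Hypothetical helper: walk the vector backwards, remembering the most
# recent valid value, and overwrite each invalid entry with it.
# Assumes x has no NA values.
fill_next_valid <- function(x, invalid = 9) {
  nxt <- NA  # no valid value seen yet
  for (i in rev(seq_along(x))) {
    if (x[i] != invalid) {
      nxt <- x[i]          # remember this valid value
    } else if (!is.na(nxt)) {
      x[i] <- nxt          # replace the invalid entry with the next valid one
    }
  }
  x
}

# ave() applies the function within each id and returns the results
# in the original row order
df$ind <- ave(df$ind, df$id, FUN = fill_next_valid)
print(df)
```

ave() is part of base R, so no extra packages are needed, and because it preserves row order the data frame does not have to be split and re-bound by hand.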
