Question

I have an input data frame / data table like this:

ID location jobid
X   city1    1
X   city1    2
Y   city2    3
Y   city3    4
Z   city1    5
X   city1    6

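I want to return the rows where the values of ID and location repeat in consecutive rows, along with a count (similar to uniq -c in bash, applied to selected columns). Desired output: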

ID location count 
X   city1     2

Any idea how to do this in R? I tried something like cumsum, but couldn't get it right. Thanks!

Answer 1

A quick solution:

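DF <- 
read.csv(
text="ID,location,jobid
X,city1,1
X,city1,2
Y,city2,3
Y,city3,4
Z,city1,5
X,city1,6",as.is=TRUE)

# length of each run of consecutive rows sharing the same (ID, location) pair
count <- rle(paste(DF$ID,DF$location,sep='|'))$lengths

# cumsum(count) is the index of the last row of each run
res <- cbind(DF[cumsum(count),c('ID','location')],count)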

> res
  ID location   count
2  X    city1       2
3  Y    city2       1
4  Y    city3       1
5  Z    city1       1
6  X    city1       1
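To see why this works, here is a small sketch of what rle() and cumsum() produce for the example. The vector key is written out by hand to mirror the paste() call above, and the names runs, last_row and first_row are only illustrative:

```r
# The combined ID|location key for the six example rows
key <- c("X|city1", "X|city1", "Y|city2", "Y|city3", "Z|city1", "X|city1")

runs <- rle(key)
runs$values   # one entry per run of identical consecutive keys
runs$lengths  # 2 1 1 1 1

# Running total of the run lengths = row index where each run ends
last_row <- cumsum(runs$lengths)          # 2 3 4 5 6

# Row index where each run starts, if the first row of a run is preferred
first_row <- last_row - runs$lengths + 1  # 1 3 4 5 6
```

Because every row in a run has the same ID and location, indexing DF by last_row or by first_row gives the same values in those two columns; only the row names differ.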

If you only want the rows that repeat consecutively, just filter res:

res[res$count > 1,]

  ID location   count
2  X    city1       2
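Since the question mentions cumsum, here is a sketch of the same idea without rle(), in base R only. It assumes DF has at least one row and that the values do not contain the '|' separator in a way that would make two different (ID, location) pairs paste to the same key. The names key, changed, run_id, res2 and dups are illustrative, not part of the answer above:

```r
DF <- data.frame(
  ID       = c("X", "X", "Y", "Y", "Z", "X"),
  location = c("city1", "city1", "city2", "city3", "city1", "city1"),
  jobid    = 1:6,
  stringsAsFactors = FALSE
)

key <- paste(DF$ID, DF$location, sep = "|")

# TRUE wherever a row's key differs from the previous row's key
# (the first row always starts a new run)
changed <- c(TRUE, key[-1] != key[-length(key)])

# Cumulative sum of the change flags labels each run: 1 1 2 3 4 5
run_id <- cumsum(changed)

# First row of each run, plus the number of rows in that run
res2 <- data.frame(DF[!duplicated(run_id), c("ID", "location")],
                   count = tabulate(run_id))

# Keep only the runs that actually repeat
dups <- res2[res2$count > 1, ]
dups
```

Here the row kept for each run is its first row rather than its last, so the row name of the X/city1 result is 1 instead of 2; the ID, location and count values match the rle() version.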

Return consecutive duplicate rows in R with a count (comparing several columns)
