Question

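I need to create a variable in a df to which I assign a unique sequential value based on the result of a split. After searching, I found that split() could help, but I am still stuck on how to assign the sequential values.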

I ran split() on a simplified form of my data.

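I want to put the identity in another variable (df$order), so that every row in the first split gets one value, every row in the second split gets a second value, every row in the third a third, and so on. I am fairly new to R and cannot write loops.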

The output I want is this data with the additional order column:

structure(list(Year = c(2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 
2014L, 2014L), Session = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = "July", class = "factor"), SiteName = structure(c(2L, 
2L, 1L, 1L, 4L, 4L, 3L, 3L), .Label = c("Kaoshe", "Matoa", 
"Livingi", "Sedina"), class = "factor"), Temp = c(23L, 12L, 15L, 
27L, 30L, 21L, 21L, 21L)), .Names = c("Year", "Session", "SiteName", 
"Temp"), class = "data.frame", row.names = c(NA, -8L))

Answer 1

We can use .GRP from data.table:

library(data.table)
setDT(df)[, order := .GRP, .(SiteName, Session, Year)]

Or in base R:

df$order <- cumsum(!duplicated(df[1:3]))
df$order
#[1] 1 1 2 2 3 3 4 4
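
One caveat worth illustrating: the cumsum(!duplicated(...)) idiom numbers groups correctly only when the rows of each group sit next to each other, as they do in the example data. The sketch below is not part of the original answer. It rebuilds the data from the question's structure() output and shows an alternative that matches each row's key against the unique keys, which gives the same result here and also works when groups are interleaved. The names key, shuffled, and skey are hypothetical helpers introduced only for this illustration.

```r
# Rebuild the example data (values copied from the structure() output in the question).
df <- data.frame(
  Year     = rep(2014L, 8),
  Session  = factor(rep("July", 8)),
  SiteName = factor(c("Matoa", "Matoa", "Kaoshe", "Kaoshe",
                      "Sedina", "Sedina", "Livingi", "Livingi"),
                    levels = c("Kaoshe", "Matoa", "Livingi", "Sedina")),
  Temp     = c(23L, 12L, 15L, 27L, 30L, 21L, 21L, 21L)
)

# One key string per row, built from the three grouping columns.
key <- paste(df$SiteName, df$Session, df$Year, sep = "|")

# Position of each key among the unique keys = group ID in order of first appearance.
df$order <- match(key, unique(key))
df$order
#[1] 1 1 2 2 3 3 4 4

# With interleaved groups the two approaches differ.
shuffled <- df[c(1, 3, 2, 4), ]   # Matoa, Kaoshe, Matoa, Kaoshe
cumsum(!duplicated(shuffled[1:3]))
#[1] 1 2 2 2

skey <- paste(shuffled$SiteName, shuffled$Session, shuffled$Year, sep = "|")
match(skey, unique(skey))
#[1] 1 2 1 2
```

If the data is already sorted by the grouping columns, either approach should be fine; the match() version is simply the more defensive choice when row order is not guaranteed.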
