Converting time series data from wide format to long format

Time: 2015-07-02 02:43:43

Tags: r time-series pivot-table

I have time series data for four clients: clients 1, 2, 3, and 4.

names(data)
"Date" "X1.CLIENT"   "X1seasonal"  "X1trend"  "X1remainder" "X2.CLIENT" 
  "X2seasonal"  "X2trend"     "X2remainder" "X3.CLIENT"   "X3seasonal"  
"X3trend"     "X3remainder" "X4.CLIENT"   "X4seasonal"  "X4trend"     
"X4remainder"

Note that each client's data column is followed by its seasonal, trend, and remainder components for the same time period.

I would like to reshape it into a long format that looks like this:
"Date" "CLIENT_Number" "Type"  "Value"

[CLIENT_Number: 1, 2, 3, 4; Type: CLIENT, seasonal, trend, remainder]

(To help with understanding: if the wide data has 30 rows/instances, then the long format should have 30 * 4 * 4 = 480 rows/instances.)
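To make the target shape concrete, here is a minimal sketch of a dummy wide data frame that uses the same column names as above. The values are random placeholders, and `wide`, `n_rows`, and `expected_long_rows` are names assumed purely for illustration. It only checks the row count the long result should have.

```r
# Dummy wide data using the column names from names(data) above.
# The values are random placeholders, not the real client data.
set.seed(1)
n_rows <- 30
wide <- data.frame(
  Date = seq.Date(as.Date("2015-01-01"), length.out = n_rows, by = 1)
)
for (i in 1:4) {
  for (type in c(".CLIENT", "seasonal", "trend", "remainder")) {
    wide[[paste0("X", i, type)]] <- rnorm(n_rows)
  }
}

# Every non-Date cell becomes one row in the long format:
# 30 rows * 4 clients * 4 types.
expected_long_rows <- nrow(wide) * (ncol(wide) - 1)
expected_long_rows
```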

1 Answer:

Answer 0: (score: 0)

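It looks like your data is in an awkward, repeated wide format. As ssdecontrol said, use melt from reshape2; I would combine it with something like lapply to split the data by client first, e.g.: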

# Set up
install.packages("reshape2")
library(reshape2)

# Dummy data
d = data.frame(date=seq.Date(as.Date("2015-01-01"), length.out=100, by=1),
Client.1="A", trend.1=rnorm(100), remainder.1=rpois(100, 3), Client.2="B",
trend.2=rnorm(100), remainder.2=rpois(100, 3), Client.3="C", trend.3=rnorm(100),
remainder.3=rpois(100, 3))

# Split data by columns with client numbers
d = lapply(c(2,5,8), function(i){
    # Take columns relating to client i
    x = d[,c(1, seq(i, length.out=3, by=1))]
    # Rename so you have consistent factor labels
    names(x)[2:4] = c("client", "trend", "remainder")
    # Convert from wide to long
    x = melt(x, id.vars=c("date", "client"))
})

# Make list into a data frame
d = do.call("rbind", d)
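The same split-then-rbind idea can be sketched against the column names in the question. This is a base R variant rather than the melt call above, and it assumes every non-Date column is named like `X1.CLIENT` or `X3trend`. The dummy data and the names `wide` and `long` are placeholders, not the asker's real data.

```r
# Dummy wide data with the question's column layout (placeholder values).
set.seed(1)
n_rows <- 30
wide <- data.frame(
  Date = seq.Date(as.Date("2015-01-01"), length.out = n_rows, by = 1)
)
for (i in 1:4) {
  for (type in c(".CLIENT", "seasonal", "trend", "remainder")) {
    wide[[paste0("X", i, type)]] <- rnorm(n_rows)
  }
}

# Build one long block per column, then stack the blocks.
long <- do.call(rbind, lapply(names(wide)[-1], function(col) {
  data.frame(
    Date = wide$Date,
    # Digits right after the leading "X" give the client number
    CLIENT_Number = as.integer(sub("^X([0-9]+).*$", "\\1", col)),
    # Whatever follows "X<digits>" (and an optional dot) is the type
    Type = sub("^X[0-9]+\\.?", "", col),
    Value = wide[[col]],
    stringsAsFactors = FALSE
  )
}))

nrow(long)
```

With 30 wide rows this should give the 30 * 4 * 4 rows the question asks for, one per date, client, and type.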

I found this post very clear on wide and long formats: http://seananderson.ca/2013/10/19/reshape.html