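I have a long-format dataset and want to convert it to wide format using reshape, or any preprocessing that has to happen before reshape. The difficulty is that the "value" variable is non-numeric. Note that the raw data also contain legitimate duplicate records. The code below shows the data layout for each format.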
id <- c(1, 1, 1, 1, 1, 1, 1)
month <- c("jan", "feb", "feb", "march", "april", "april", "april")
stress <- c("mild", "mild", "high", "moderate", "mild", "high", "mild")
Longdata <- data.frame(id, month, stress, stringsAsFactors = FALSE)
This is the original format:
> Longdata
  id month   stress
1  1   jan     mild
2  1   feb     mild
3  1   feb     high
4  1 march moderate
5  1 april     mild
6  1 april     high
7  1 april     mild
This is how I would like the data to be organized:
id <- c(1)
jan <- c("mild")
feb <- c("mild-high")
march <- c("moderate")
april <- c("mild-high-mild")
widedata <- data.frame(id, jan, feb, march, april, stringsAsFactors = FALSE)
> widedata
  id  jan       feb    march          april
1  1 mild mild-high moderate mild-high-mild
Answer 0 (score: 0)
You can do this in two steps: first collapse the duplicates with aggregate, then reshape with either base R's reshape or dcast from the "reshape2" package.
Aggregation step:
# Collapse the stress values within each id/month combination into one string
Mediumdata <- aggregate(stress ~ id + month, Longdata, paste, collapse="-")
Mediumdata
#   id month         stress
# 1  1 april mild-high-mild
# 2  1   feb      mild-high
# 3  1   jan           mild
# 4  1 march       moderate
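To make the aggregation step more concrete, here is a minimal sketch of what `paste` with `collapse` does to the values of a single group. The vector `april_values` is a hypothetical name used only for illustration; `aggregate` applies the same call to every id/month group.

```r
# Hypothetical vector holding the three stress values recorded for id 1 in april
april_values <- c("mild", "high", "mild")

# collapse joins the elements of one vector into a single string,
# keeping them in their original order
collapsed <- paste(april_values, collapse = "-")
collapsed
```

Because the values are joined in the order in which they appear in the data, legitimate duplicates such as the second "mild" are kept rather than dropped.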
Reshape step:
# Using base R reshape
reshape(Mediumdata, direction="wide", idvar="id", timevar="month")
#   id   stress.april stress.feb stress.jan stress.march
# 1  1 mild-high-mild  mild-high       mild     moderate
# Using `dcast` from "reshape2"
library(reshape2)
dcast(Mediumdata, id ~ month, value.var="stress")
#   id          april       feb  jan    march
# 1  1 mild-high-mild mild-high mild moderate
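Both results list the months alphabetically, and the base R version prefixes each column with "stress.", whereas the layout requested in the question uses plain month names in calendar order. The following is a rough sketch of one way to get closer to that layout using base R only. The vector `month_order` is an assumption introduced here; it is not part of the original data and would need to list every month that actually occurs.

```r
# Rebuild the example data from the question
Longdata <- data.frame(
  id     = c(1, 1, 1, 1, 1, 1, 1),
  month  = c("jan", "feb", "feb", "march", "april", "april", "april"),
  stress = c("mild", "mild", "high", "moderate", "mild", "high", "mild"),
  stringsAsFactors = FALSE
)

# Assumption: the desired column order, supplied by hand
month_order <- c("jan", "feb", "march", "april")

# Step 1: collapse duplicate id/month records into one string each
Mediumdata <- aggregate(stress ~ id + month, Longdata, paste, collapse = "-")

# Step 2: sort the rows into calendar order, since reshape creates the
# wide columns in the order in which the month values first appear
Mediumdata <- Mediumdata[order(match(Mediumdata$month, month_order)), ]

# Step 3: reshape to wide format
widedata <- reshape(Mediumdata, direction = "wide", idvar = "id", timevar = "month")

# Step 4: drop the "stress." prefix and reset the row names
names(widedata) <- sub("^stress\\.", "", names(widedata))
rownames(widedata) <- NULL
widedata
```

With the example data this should give the columns id, jan, feb, march, april. A similar effect could probably be achieved with `dcast` by converting `month` to a factor with the desired level order beforehand.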