I have a data.frame that looks like this:
toolid startdate enddate stage
abc 1-Jan-13 5-Jan-13 production
abc 6-Jan-13 10-Jan-13 down
xyz 3-Jan-13 8-Jan-13 production
xyz 9-Jan-13 15-Jan-13 down
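I would like to convert the data.frame into the format below. I am trying to merge the 'startdate' and 'enddate' columns of the data.frame above into a single column named 'date', with one row per day in each range. My original data has a few thousand rows across many toolids and many stages. I have already found a way to do this using SQL, but I would prefer an R solution. I started by melting the data, as shown in the R code further down.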
toolid date stage
abc 1-Jan-13 production
abc 2-Jan-13 production
abc 3-Jan-13 production
abc 4-Jan-13 production
abc 5-Jan-13 production
abc 6-Jan-13 down
abc 7-Jan-13 down
abc 8-Jan-13 down
abc 9-Jan-13 down
abc 10-Jan-13 down
xyz 3-Jan-13 production
xyz 4-Jan-13 production
xyz 5-Jan-13 production
xyz 6-Jan-13 production
xyz 7-Jan-13 production
xyz 8-Jan-13 production
xyz 9-Jan-13 down
xyz 10-Jan-13 down
xyz 11-Jan-13 down
xyz 12-Jan-13 down
xyz 13-Jan-13 down
xyz 14-Jan-13 down
xyz 15-Jan-13 down
R code
startdate=c('1-Jan-13','6-Jan-13','3-Jan-13','9-Jan-13')
enddate=c('5-Jan-13', '10-Jan-13', '8-Jan-13', '15-Jan-13')
toolid=c('abc', 'abc', 'xyz', 'xyz')
stage=c('production', 'down', 'production', 'down')
data=data.frame(toolid,startdate,enddate,stage)
require(reshape2)
newdata=melt(data,id.vars=c('toolid','stage'))
Update: code from @Ananda Mahto's answer, plus a few extra lines to produce pivot-table-style output
## Convert "startdate" and "enddate" to date objects
data$startdate <- as.Date(data$startdate, format="%d-%b-%y")
data$enddate <- as.Date(data$enddate, format="%d-%b-%y")
## Use `seq` to create the date sequence, and manually recreate
## your dataframe. Use `do.call(rbind, ...)` to put it back together
ddd=do.call(rbind, lapply(sequence(nrow(data)), function(x) {
  data.frame(toolid = data$toolid[x],
             date = seq(data$startdate[x], data$enddate[x], by = 1),
             stage = data$stage[x])
}))
ddd
toolid date stage
1 abc 2013-01-01 production
2 abc 2013-01-02 production
3 abc 2013-01-03 production
4 abc 2013-01-04 production
5 abc 2013-01-05 production
6 abc 2013-01-06 down
7 abc 2013-01-07 down
8 abc 2013-01-08 down
9 abc 2013-01-09 down
10 abc 2013-01-10 down
11 xyz 2013-01-03 production
12 xyz 2013-01-04 production
13 xyz 2013-01-05 production
14 xyz 2013-01-06 production
15 xyz 2013-01-07 production
16 xyz 2013-01-08 production
17 xyz 2013-01-09 down
18 xyz 2013-01-10 down
19 xyz 2013-01-11 down
20 xyz 2013-01-12 down
21 xyz 2013-01-13 down
22 xyz 2013-01-14 down
23 xyz 2013-01-15 down
ddd1=dcast(ddd,date~stage)
ddd1
date down production
1 2013-01-01 0 1
2 2013-01-02 0 1
3 2013-01-03 0 2
4 2013-01-04 0 2
5 2013-01-05 0 2
6 2013-01-06 1 1
7 2013-01-07 1 1
8 2013-01-08 1 1
9 2013-01-09 2 0
10 2013-01-10 2 0
11 2013-01-11 1 0
12 2013-01-12 1 0
13 2013-01-13 1 0
14 2013-01-14 1 0
15 2013-01-15 1 0
Answer 0 (score: 4)
I'm sure there are more "proper" ways to do this, but this is what quickly came to mind.
First, convert "startdate" and "enddate" to date objects:
data$startdate <- as.Date(data$startdate, format="%d-%b-%y")
data$enddate <- as.Date(data$enddate, format="%d-%b-%y")
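As a small illustration of the conversion step, the sketch below parses two of the question's date strings on their own. The variable names `x` and `d` are just for illustration. Note that `%b` matches abbreviated month names in the current locale, so the example switches `LC_TIME` to the `C` locale first; that call is an assumption about your setup and may be unnecessary on an English-language system.

```r
# %b depends on the locale; force English-style month abbreviations
# for this session (assumption: your data uses "Jan", "Feb", ...).
invisible(Sys.setlocale("LC_TIME", "C"))

x <- c("1-Jan-13", "10-Jan-13")

# %d = day of month, %b = abbreviated month name, %y = two-digit year
d <- as.Date(x, format = "%d-%b-%y")

# Date objects support arithmetic, which is what seq() relies on later
span_days <- as.integer(d[2] - d[1])
```

Once the columns are of class Date, `seq(..., by = 1)` steps one day at a time.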
Then, use seq to create the date sequence, and manually recreate the data.frame. Use do.call(rbind, ...) to put it back together.
ddd <- do.call(rbind, lapply(sequence(nrow(data)), function(x) {
  data.frame(toolid = data$toolid[x],
             date = seq(data$startdate[x], data$enddate[x], by = 1),
             stage = data$stage[x])
}))
ddd
# toolid date stage
# 1 abc 2013-01-01 production
# 2 abc 2013-01-02 production
# 3 abc 2013-01-03 production
# 4 abc 2013-01-04 production
# 5 abc 2013-01-05 production
# 6 abc 2013-01-06 down
# 7 abc 2013-01-07 down
# 8 abc 2013-01-08 down
# 9 abc 2013-01-09 down
# 10 abc 2013-01-10 down
# 11 xyz 2013-01-03 production
# 12 xyz 2013-01-04 production
# 13 xyz 2013-01-05 production
# 14 xyz 2013-01-06 production
# 15 xyz 2013-01-07 production
# 16 xyz 2013-01-08 production
# 17 xyz 2013-01-09 down
# 18 xyz 2013-01-10 down
# 19 xyz 2013-01-11 down
# 20 xyz 2013-01-12 down
# 21 xyz 2013-01-13 down
# 22 xyz 2013-01-14 down
# 23 xyz 2013-01-15 down
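With a few thousand input rows, building one small data.frame per row and binding them together may become slow. The following is a minimal sketch of a vectorized alternative. It is not part of the approach above, and the names `n`, `idx`, and `ddd2` are made up for illustration. It assumes every `enddate` is on or after its `startdate`.

```r
# Sample data from the question, with the dates already converted
data <- data.frame(
  toolid    = c("abc", "abc", "xyz", "xyz"),
  startdate = as.Date(c("2013-01-01", "2013-01-06", "2013-01-03", "2013-01-09")),
  enddate   = as.Date(c("2013-01-05", "2013-01-10", "2013-01-08", "2013-01-15")),
  stage     = c("production", "down", "production", "down")
)

# Number of days covered by each row, both endpoints included
n <- as.integer(data$enddate - data$startdate) + 1L

# Repeat each row index once per day that the row covers
idx <- rep(seq_len(nrow(data)), times = n)

# sequence(n) yields 1..n[1], 1..n[2], ...; subtracting 1 gives day offsets
ddd2 <- data.frame(
  toolid = data$toolid[idx],
  date   = data$startdate[idx] + (sequence(n) - 1L),
  stage  = data$stage[idx]
)
```

This should give the same rows as `ddd`, without calling `data.frame` once per input row.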
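Finally, looking at what you say you want as the end result, you can stick with base R and use table. I've wrapped it in as.data.frame.matrix() because I assume you want a data.frame as the result: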
as.data.frame.matrix(table(ddd[-1]))
# down production
# 2013-01-01 0 1
# 2013-01-02 0 1
# 2013-01-03 0 2
# 2013-01-04 0 2
# 2013-01-05 0 2
# 2013-01-06 1 1
# 2013-01-07 1 1
# 2013-01-08 1 1
# 2013-01-09 2 0
# 2013-01-10 2 0
# 2013-01-11 1 0
# 2013-01-12 1 0
# 2013-01-13 1 0
# 2013-01-14 1 0
# 2013-01-15 1 0
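The result above keeps the dates as row names, whereas the `dcast` output in the question has them in a `date` column. If a column is preferred, one possible sketch is shown below. The name `counts` is illustrative, and `ddd` is rebuilt here from the sample data only so that the example runs on its own.

```r
# Rebuild the expanded data from the question's sample rows
data <- data.frame(
  toolid    = c("abc", "abc", "xyz", "xyz"),
  startdate = as.Date(c("2013-01-01", "2013-01-06", "2013-01-03", "2013-01-09")),
  enddate   = as.Date(c("2013-01-05", "2013-01-10", "2013-01-08", "2013-01-15")),
  stage     = c("production", "down", "production", "down")
)
ddd <- do.call(rbind, lapply(seq_len(nrow(data)), function(x) {
  data.frame(toolid = data$toolid[x],
             date   = seq(data$startdate[x], data$enddate[x], by = 1),
             stage  = data$stage[x])
}))

# Cross-tabulate date by stage; dates end up as character row names
counts <- as.data.frame.matrix(table(ddd$date, ddd$stage))

# Move the row names into a proper Date column and drop the row names
counts <- data.frame(date = as.Date(rownames(counts)), counts,
                     row.names = NULL)
```

Keeping `date` as a Date column makes later filtering or plotting by date more direct than working with row names.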