执行以下可重现的示例代码时,
value = c(1:5)
name = c('A','B','C','D','F')
`2015.cost` = c(100,200,250,300,450)
`2016.cost` = c(200,300,350,400,550)
`2017.cost` = c(300,400,450,500,650)
`2015.profit` = c(1000,4200,2450,1500,7650)
`2016.profit` = c(1300,4300,3450,5100,6850)
`2017.profit` = c(1300,4400,6450,5001,6500)
df <- data.frame(value,name,`2015.cost`,`2016.cost`,`2017.cost`,`2015.profit`,`2016.profit`,`2017.profit`)
我创建了以下数据框:
structure(list(value = 1:5, name = structure(1:5, .Label = c("A",
"B", "C", "D", "F"), class = "factor"), X2015.cost = c(100, 200,
250, 300, 450), X2016.cost = c(200, 300, 350, 400, 550), X2017.cost = c(300,
400, 450, 500, 650), X2015.profit = c(1000, 4200, 2450, 1500,
7650), X2016.profit = c(1300, 4300, 3450, 5100, 6850), X2017.profit = c(1300,
4400, 6450, 5001, 6500)), .Names = c("value", "name", "X2015.cost",
"X2016.cost", "X2017.cost", "X2015.profit", "X2016.profit", "X2017.profit"
), row.names = c(NA, -5L), class = "data.frame")
我本质上希望能够解析我的成本(在'。'这样的分隔符)和利润列名称的年份,然后将数据框从这种宽格式转移到长,这样我就有多年了所有在列和成本,利润值作为另一列。我的目标是建立一个像这样的数据框:
value name year cost profit
1 A 2015 100 1000
2 B 2015 200 4200
3 C 2015 250 2450
4 D 2015 300 1500
5 E 2015 450 7650
1 A 2016 200 1300
2 B 2016 300 4300
3 C 2016 350 3450
4 D 2016 400 5100
5 E 2016 550 6850
1 A 2017 300 1300
2 B 2017 400 4400
3 C 2017 450 6450
4 D 2017 500 5001
5 E 2017 650 6500
任何帮助将不胜感激
答案 0 :(得分:4)
这是 dplyr 和 tidyr
的一种方式library(dplyr)
library(tidyr)
df %>%
gather('col', 'val', X2015.cost:X2017.profit) %>%
separate(col, c('year', 'col_type'), sep='\\.') %>%
mutate(year=extract_numeric(year)) %>%
spread(col_type, val)
答案 1 :(得分:3)
我们可以使用melt
中的data.table
,measure
可以使用多个library(data.table)
dM <- melt(setDT(df1), measure=patterns('cost$', 'profit$'),
variable.name='year', value.name=c('cost', 'profit'))
dM[ ,year :=as.numeric(unique(gsub('\\D+', '',
grep('^X\\d+', names(df1), value=TRUE))))[year]]
dM
# value name year cost profit
# 1: 1 A 2015 100 1000
# 2: 2 B 2015 200 4200
# 3: 3 C 2015 250 2450
# 4: 4 D 2015 300 1500
# 5: 5 F 2015 450 7650
# 6: 1 A 2016 200 1300
# 7: 2 B 2016 300 4300
# 8: 3 C 2016 350 3450
# 9: 4 D 2016 400 5100
#10: 5 F 2016 550 6850
#11: 1 A 2017 300 1300
#12: 2 B 2017 400 4400
#13: 3 C 2017 450 6450
#14: 4 D 2017 500 5001
#15: 5 F 2017 650 6500
列。
{{1}}