I have a data frame, df_wide, that holds company data in wide format:
df_wide <- data.frame(Company=c('CompanyA','CompanyB', 'CompanyC'),
Industry=c('Manufacturing', 'Telecom', 'Services'),
Sales.2015=c('100', '500', '1000'),
Sales.2016=c('110', '550', '1100'),
Sales.2017=c('120', '600', '1200'),
EBITDA.2015=c('10', '50', '100'),
EBITDA.2016=c('11', '55', '110'),
EBITDA.2017=c('12', '60', '120'))
Company Industry Sales.2015 Sales.2016 Sales.2017 EBITDA.2015 EBITDA.2016 EBITDA.2017
1 CompanyA Manufacturing 100 110 120 10 11 12
2 CompanyB Telecom 500 550 600 50 55 60
3 CompanyC Services 1000 1100 1200 100 110 120
I would like to convert the data to a long format like df_long:
df_long <- data.frame(Company=c('CompanyA', 'CompanyA', 'CompanyA', 'CompanyB', 'CompanyB','CompanyB','CompanyC','CompanyC', 'CompanyC'),
Industry=c('Manufacturing','Manufacturing','Manufacturing','Telecom','Telecom','Telecom','Services','Services','Services'),
Year=c('2015','2016','2017','2015','2016','2017','2015','2016','2017'),
Sales=c('100','110','120','500', '550','600','1000','1100','1200'),
EBITDA=c('10','11','12','50','55','60','100','110','120'))
Company Industry Year Sales EBITDA
1 CompanyA Manufacturing 2015 100 10
2 CompanyA Manufacturing 2016 110 11
3 CompanyA Manufacturing 2017 120 12
4 CompanyB Telecom 2015 500 50
5 CompanyB Telecom 2016 550 55
6 CompanyB Telecom 2017 600 60
7 CompanyC Services 2015 1000 100
8 CompanyC Services 2016 1100 110
9 CompanyC Services 2017 1200 120
I tried pivot_longer, and it works fine with a single variable, but I got stuck when trying to pivot Sales and EBITDA at the same time.
df_long2 <- df_wide %>% pivot_longer(cols = starts_with("Sales"),
names_to = "Year",
values_to = "Sales")
Answer 0 (score: 5)
Using pivot_longer:
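library(tidyr)  # attaches pivot_longer() and the %>% pipe

# ".value" sends the part of each name before the dot (Sales, EBITDA)
# to its own output column; the part after the dot goes to Year.
tidyr::pivot_longer(df_wide,
                    cols = -c(Company, Industry),
                    names_to = c(".value", "Year"),
                    names_sep = "\\.") %>% type.convert(as.is = FALSE)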
# Company Industry Year Sales EBITDA
# <fct> <fct> <int> <int> <int>
#1 CompanyA Manufacturing 2015 100 10
#2 CompanyA Manufacturing 2016 110 11
#3 CompanyA Manufacturing 2017 120 12
#4 CompanyB Telecom 2015 500 50
#5 CompanyB Telecom 2016 550 55
#6 CompanyB Telecom 2017 600 60
#7 CompanyC Services 2015 1000 100
#8 CompanyC Services 2016 1100 110
#9 CompanyC Services 2017 1200 120
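The special name ".value" in names_to is what lets one call handle Sales and EBITDA together: the matching piece of each column name becomes an output column name instead of a value. A minimal sketch of the same idea follows, assuming a four-digit year suffix and using names_pattern instead of names_sep. It passes as.is = TRUE so that Company and Industry stay character rather than becoming factors; both of those choices are assumptions for illustration, not part of the answer above.

```r
library(tidyr)

# Same toy data as in the question (all measurements stored as strings).
df_wide <- data.frame(
  Company     = c("CompanyA", "CompanyB", "CompanyC"),
  Industry    = c("Manufacturing", "Telecom", "Services"),
  Sales.2015  = c("100", "500", "1000"),
  Sales.2016  = c("110", "550", "1100"),
  Sales.2017  = c("120", "600", "1200"),
  EBITDA.2015 = c("10", "50", "100"),
  EBITDA.2016 = c("11", "55", "110"),
  EBITDA.2017 = c("12", "60", "120"),
  stringsAsFactors = FALSE
)

# First capture group -> output column name (".value"),
# second capture group -> the Year column.
df_long <- pivot_longer(
  df_wide,
  cols = -c(Company, Industry),
  names_to = c(".value", "Year"),
  names_pattern = "^(.+)\\.([0-9]{4})$"
)

# Convert the string columns to integers; as.is = TRUE keeps
# Company and Industry as character instead of factor.
df_long <- type.convert(df_long, as.is = TRUE)
```

With as.is = TRUE the Year, Sales and EBITDA columns come out as integers, which is usually what later arithmetic or plotting needs.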
Answer 1 (score: 1)
Base R solution:
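df_long <-
  reshape(df_wide,
          direction = "long",
          # every column except the two identifiers gets stacked
          varying = which(!names(df_wide) %in% c("Company", "Industry")),
          # name the time column "Year" instead of the default "time"
          timevar = "Year",
          ids = NULL,
          new.row.names = 1:(length(which(!names(df_wide) %in% c("Company", "Industry"))) * nrow(df_wide))
  )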
Answer 2 (score: 0)
I am not yet familiar with pivot_longer(), but here is a data.table solution:
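The code that accompanied this answer is missing here. As a hedged sketch only, and not necessarily what the answerer wrote, one common data.table approach is melt() with patterns(), which stacks several groups of columns at once. The names dt_long and years are made up for this illustration.

```r
library(data.table)

# Same toy data as in the question.
df_wide <- data.frame(
  Company     = c("CompanyA", "CompanyB", "CompanyC"),
  Industry    = c("Manufacturing", "Telecom", "Services"),
  Sales.2015  = c("100", "500", "1000"),
  Sales.2016  = c("110", "550", "1100"),
  Sales.2017  = c("120", "600", "1200"),
  EBITDA.2015 = c("10", "50", "100"),
  EBITDA.2016 = c("11", "55", "110"),
  EBITDA.2017 = c("12", "60", "120"),
  stringsAsFactors = FALSE
)

# Years taken from the Sales.* column names, in column order.
years <- as.integer(sub("^Sales\\.", "", grep("^Sales\\.", names(df_wide), value = TRUE)))

# Each pattern defines one group of columns; the groups are melted in parallel.
dt_long <- melt(
  as.data.table(df_wide),
  id.vars       = c("Company", "Industry"),
  measure.vars  = patterns("^Sales\\.", "^EBITDA\\."),
  variable.name = "Year",
  value.name    = c("Sales", "EBITDA")
)

# With several groups, melt() stores the position (1, 2, 3) in Year,
# so map those positions back to the actual years.
dt_long[, Year := years[as.integer(Year)]]

# Order rows by company and year, as in the desired df_long.
setorder(dt_long, Company, Year)
```

Sales and EBITDA are still character here because the inputs were strings; convert them with as.integer() if numbers are needed.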
Answer 3 (score: 0)
Here is a base R solution (similar to @hello_friend's), where reshape() is used to reshape the table from wide to long:
df_long <- reshape(df_wide,
direction = "long",
varying = seq(df_wide)[-(1:2)],
ids = NULL,
timevar = "Year",
times = unique(gsub("\\w+\\.(.*)","\\1",names(df_wide[-(1:2)]))),
new.row.names = seq(ncol(df_wide[-(1:2)])*nrow(df_wide))
)
which gives
> df_long
Company Industry Year Sales EBITDA
1 CompanyA Manufacturing 2015 100 10
2 CompanyB Telecom 2015 500 50
3 CompanyC Services 2015 1000 100
4 CompanyA Manufacturing 2016 110 11
5 CompanyB Telecom 2016 550 55
6 CompanyC Services 2016 1100 110
7 CompanyA Manufacturing 2017 120 12
8 CompanyB Telecom 2017 600 60
9 CompanyC Services 2017 1200 120
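Note that the rows above are grouped by year, while the df_long requested in the question is grouped by company. A minimal sketch of one way to get that order follows. It assumes Company uniquely identifies each row of df_wide (true for this toy data) and lets reshape() split the column names at the dot instead of passing times explicitly.

```r
# Same toy data as in the question.
df_wide <- data.frame(
  Company     = c("CompanyA", "CompanyB", "CompanyC"),
  Industry    = c("Manufacturing", "Telecom", "Services"),
  Sales.2015  = c("100", "500", "1000"),
  Sales.2016  = c("110", "550", "1100"),
  Sales.2017  = c("120", "600", "1200"),
  EBITDA.2015 = c("10", "50", "100"),
  EBITDA.2016 = c("11", "55", "110"),
  EBITDA.2017 = c("12", "60", "120"),
  stringsAsFactors = FALSE
)

# reshape() splits "Sales.2015" at sep into the variable name ("Sales")
# and the time value (2015), so no explicit `times` is needed.
df_long <- reshape(
  df_wide,
  direction = "long",
  varying   = setdiff(names(df_wide), c("Company", "Industry")),
  sep       = ".",
  timevar   = "Year",
  idvar     = "Company"   # assumption: one row per company
)

# Sort by company, then year, and reset the row names to 1..n.
df_long <- df_long[order(df_long$Company, df_long$Year), ]
rownames(df_long) <- NULL
```

Here Year comes back numeric because reshape() converts the guessed time values when it can, while Sales and EBITDA remain character like the inputs.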