I have a wide-format table whose first three rows describe the data shown in it. For example:
Company: | Company A | Company B | Company C | ... | Company N
Data source: | Budget | Actual | Budget | ... | ...
Currency: | USD | EUR | USD | ... | ...
Indicator: | | | | |
Sales | 500 | 1000 | 1500 | ... | ...
Gross Income | 200 | 300 | 400 | ... | ...
... | ... | ... | ... | ... | ...
Indicator J | ... | ... | ... | ... | ...
I would like to reshape it into long format with the following layout:
Indicator | Company | Currency | Data Source | Value
Sales | Company A | USD | Budget | 500
Sales | Company B | EUR | Actual | 1000
... | ... | ... | ... | ...
I tried to melt it with the reshape2 package, but I did not manage to turn rows 2 and 3 into variables.
dput(AAA)
structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income",
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"),
V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"),
V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"),
V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"),
V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1",
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
-8L))
Answer 0 (score: 2)
Here is a solution that involves transposing your data and doing some cleaning. The rest is done by 'melt':
AAA <- structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income",
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"),
V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"),
V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("",
"1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"),
V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"),
V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("",
"195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1",
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA,
-8L))
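library(reshape2)
# transpose the data; t() goes through as.matrix(), which turns the factors into character
dft <- data.frame(t(AAA), stringsAsFactors = FALSE)
# use the first row (the labels from V1) as column names, then drop that row
colnames(dft) <- unlist(dft[1, ])
dft <- dft[-1, ]
# remove the empty 'Indicator:' column
dft[['Indicator:']] <- NULL
# melt the data, keeping the three descriptor columns as id variables
melt(dft, id.vars = c('Company:', 'Source:', 'Currency:'), variable.name = 'Indicator:')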
#    Company: Source: Currency: Indicator: value
# 1 Company A  Budget       USD      Sales  1000
# 2 Company B  Actual       USD      Sales  1500
# 3 Company C  Budget       EUR      Sales  2000
# 4 Company D  Actual       EUR      Sales  2500
# ... (12 more rows: Gross Income, EBITDA, Net Income)
You may need some more cleaning (every column is now character, and it may also be better to set the column names before transposing ...).
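The extra cleaning mentioned above could look roughly like the sketch below. It is only one possible way to do it, not part of the original answer: the labels are set as row names before transposing, the trailing colons are stripped, and the value column is converted to numeric. The names `m` and `long` are made up for illustration, and `AAA` is rebuilt here as a plain character data frame (same values as the dput output) just to keep the example self-contained.

```r
library(reshape2)

# Stand-in for AAA from the question: same values as the dput output,
# stored as character instead of factor (an assumption made for brevity).
AAA <- data.frame(
  V1 = c("Company:", "Source:", "Currency:", "Indicator:",
         "Sales", "Gross Income", "EBITDA", "Net Income"),
  V2 = c("Company A", "Budget", "USD", "", "1000", "300", "150", "25"),
  V3 = c("Company B", "Actual", "USD", "", "1500", "400", "175", "30"),
  V4 = c("Company C", "Budget", "EUR", "", "2000", "500", "185", "45"),
  V5 = c("Company D", "Actual", "EUR", "", "2500", "700", "195", "50"),
  stringsAsFactors = FALSE
)

# Set the labels as row names *before* transposing, so they become the
# column names directly; also drop the trailing colon from each label.
m <- as.matrix(AAA[, -1])
rownames(m) <- sub(":$", "", AAA$V1)

# check.names = FALSE keeps names such as "Gross Income" unchanged.
dft <- data.frame(t(m), stringsAsFactors = FALSE, check.names = FALSE)
dft$Indicator <- NULL  # the "Indicator:" row holds no data

# Melt, then convert the values from character to numeric.
long <- melt(dft, id.vars = c("Company", "Source", "Currency"),
             variable.name = "Indicator")
long$value <- as.numeric(long$value)

# Reorder the columns to match the layout asked for in the question.
long <- long[, c("Indicator", "Company", "Currency", "Source", "value")]
head(long)
```

Note that `as.numeric()` returns `NA` (with a warning) for anything that is not a plain number, so real data with thousands separators or stray text would need handling before the conversion.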