Melt multiple rows

Time: 2013-08-02 13:05:54

Tags: r reshape reshape2

I have a wide-format table whose first three rows describe the data shown in the table. For example:

Company:               |  Company A  |  Company B  |  Company C  |       |  Company N
Data source:           |  Budget     |  Actual     |  Budget     |  ...  |    ...
Currency:              |  USD        |  EUR        |  USD        |       |    ...
Indicator:
 Sales                    500            1000         1500        ...       ...
 Gross Income             200            300           400        ...       ...
 ...                      ...            ...           ...        ...       ...
 Indicator J              ...            ...           ...        ...

I would like to reshape it into long format with the following layout:

Indicator | Company   | Currency | Data Source | Value
 Sales    | Company A |   USD    | Budget      | 500
 Sales    | Company B |   EUR    | Actual      | 1000
 ...      |    ...    |    ...   |    ...      |  ...

I tried to melt it with the reshape2 package, but I did not manage to turn rows 2 and 3 into variables. Here is a sample of the data:

dput(AAA)
structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income", 
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"), 
    V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"), 
    V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"), 
    V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"), 
    V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA, 
-8L))

1 answer:

Answer 0 (score: 2)

Here is a solution that involves transposing your data and doing some cleanup. The rest is done by 'melt':

AAA <- structure(list(V1 = structure(c(1L, 8L, 2L, 5L, 7L, 4L, 3L, 6L
), .Label = c("Company:", "Currency:", "EBITDA", "Gross Income", 
"Indicator:", "Net Income", "Sales", "Source:"), class = "factor"), 
    V2 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1000", "150", "25", "300", "Budget", "Company A", "USD"), class = "factor"), 
    V3 = structure(c(7L, 6L, 8L, 1L, 2L, 5L, 3L, 4L), .Label = c("", 
    "1500", "175", "30", "400", "Actual", "Company B", "USD"), class = "factor"), 
    V4 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "185", "2000", "45", "500", "Budget", "Company C", "EUR"), class = "factor"), 
    V5 = structure(c(7L, 6L, 8L, 1L, 3L, 5L, 2L, 4L), .Label = c("", 
    "195", "2500", "50", "700", "Actual", "Company D", "EUR"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5"), class = "data.frame", row.names = c(NA, 
-8L))

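library(reshape2)

# Transpose so that each company becomes a row; t() yields a character matrix
dft <- data.frame(t(AAA), stringsAsFactors = FALSE)

# Use the first row (the former V1 labels) as column names, then drop it
colnames(dft) <- unlist(dft[1, ])
dft <- dft[-1, ]

# Remove the empty "Indicator:" column (column 4)
dft[, 4] <- NULL

# Melt the indicator columns into long format
melt(dft, id.vars = c('Company:', 'Source:', 'Currency:'), variable.name = 'Indicator:')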

#     Company: Source: Currency: Indicator: value
# 1  Company A  Budget       USD      Sales  1000
# 2  Company B  Actual       USD      Sales  1500
# 3  Company C  Budget       EUR      Sales  2000
# 4  Company D  Actual       EUR      Sales  2500
# ... (rows for the remaining indicators omitted)

You may need some more cleanup (every column is now character, and you could also set the column names before transposing ...).
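
The cleanup hinted at above could look roughly like the sketch below. It is only one possible way to do it, not part of the original answer: it rebuilds a plain-character stand-in for `AAA` so the snippet runs on its own, and the names `labels`, `keep`, `m` and `long` are made up for illustration. Stripping the trailing colons from the labels is likewise an assumption about what the final column names should look like.

```r
library(reshape2)

# Stand-in for AAA with the same layout, using character columns instead of factors
AAA <- data.frame(
  V1 = c("Company:", "Source:", "Currency:", "Indicator:",
         "Sales", "Gross Income", "EBITDA", "Net Income"),
  V2 = c("Company A", "Budget", "USD", "", "1000", "300", "150", "25"),
  V3 = c("Company B", "Actual", "USD", "", "1500", "400", "175", "30"),
  V4 = c("Company C", "Budget", "EUR", "", "2000", "500", "185", "45"),
  V5 = c("Company D", "Actual", "EUR", "", "2500", "700", "195", "50"),
  stringsAsFactors = FALSE
)

# Take the labels from V1 before transposing and drop the trailing colon
labels <- sub(":$", "", AAA$V1)
keep <- labels != "Indicator"            # the "Indicator:" row carries no data

# Transpose only the data columns; companies become rows
m <- t(as.matrix(AAA[keep, -1]))
colnames(m) <- labels[keep]
dft <- data.frame(m, stringsAsFactors = FALSE, check.names = FALSE)

# Melt, convert the values to numeric, and order columns as in the question
long <- melt(dft, id.vars = c("Company", "Source", "Currency"),
             variable.name = "Indicator")
long$value <- as.numeric(long$value)
long <- long[, c("Indicator", "Company", "Currency", "Source", "value")]
```

Setting the names before the transpose avoids the separate "promote first row to header, then delete it" step, and `check.names = FALSE` keeps a label such as "Gross Income" from being rewritten to a syntactic name.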