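DataTable in R: format rows with a specific category value as percentages

Time: 2016-01-11 18:19:53

Tags: r data.table percentage

I have a data table, and my goal is to format every row where MONTH == "Percent Change:" as a percentage: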

             MONTH YEAR                    Client      Revenue  Metric 1          Metric 2         Metric 3
1:          MTD: 1 2015                  Client A 255999.33000 5.251913e+07     3.476303e+07   0.66191181
2:          MTD: 1 2016                  Client A 393450.17000 6.676211e+07     3.557979e+07   0.53293384
3: Percent Change: 2016                  Client A     53.69187 2.711961e+01     2.349502e+00 -19.48567206
4:          MTD: 1 2016                  Client B 178793.62000 1.339837e+09     2.527131e+07   0.01886148
5: Percent Change: 2016                  Client B           NA           NA               NA           NA
6:          MTD: 1 2015                  Client C  98492.19000 1.535520e+08     2.213594e+07   0.14415924
    Metric 4
1:  7.364126
2: 11.058249
3: 50.163774
4:  7.074964
5:        NA
6:  4.449424

datatable(df) %>%
  formatPercentage(df[which(df$MONTH== "Percent Change:"),],2)

How can I format only the "Percent Change:" rows of the data table as percentages?

Expected output:

             MONTH YEAR                    Client      Revenue  Metric 1          Metric 2         Metric 3
1:          MTD: 1 2015                  Client A 255999.33000 5.251913e+07     3.476303e+07   0.66191181
2:          MTD: 1 2016                  Client A 393450.17000 6.676211e+07     3.557979e+07   0.53293384
3: Percent Change: 2016                  Client A     53.69187% 2.711961e+01%     2.349502e+00% -19.48567206%
4:          MTD: 1 2016                  Client B 178793.62000 1.339837e+09     2.527131e+07   0.01886148
5: Percent Change: 2016                  Client B           NA           NA               NA           NA
6:          MTD: 1 2015                  Client C  98492.19000 1.535520e+08     2.213594e+07   0.14415924
    Metric 4
1:  7.364126
2: 11.058249
3: 50.163774%
4:  7.074964
5:        NA
6:  4.449424

dput: 

function (x, file = "", control = c("keepNA", "keepInteger", 
    "showAttributes")) 
{
    if (is.character(file)) 
        if (nzchar(file)) {
            file <- file(file, "wt")
            on.exit(close(file))
        }
        else file <- stdout()
    opts <- .deparseOpts(control)
    if (isS4(x)) {
        clx <- class(x)
        cat("new(\"", clx, "\"\n", file = file, sep = "")
        for (n in methods::.slotNames(clx)) {
            cat("    ,", n, "= ", file = file)
            dput(methods::slot(x, n), file = file, control = control)
        }
        cat(")\n", file = file)
        invisible()
    }
    else .Internal(dput(x, file, opts))
}
<bytecode: 0x000000003031c860>
<environment: namespace:base>

The error I get when I try to run my datatable code is:

Error in names[name] : invalid subscript type 'list'

1 answer:

Answer 0: (score: 2)

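We can convert columns 4 through 8 to character class. Then, using a logical condition in i to select the "Percent Change:" rows, we loop over columns 4 to 8 with lapply, paste0 a % onto the values, and assign (:=) the result back to those columns.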

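library(data.table)
# Convert columns 4 to 8 to character so they can hold strings such as "53.69187%"
setDT(df)[, (4:8) := lapply(.SD, as.character), .SDcols = 4:8]
# Append "%" only in the "Percent Change:" rows; NA values are left as NA
df[MONTH == "Percent Change:", (4:8) :=
    lapply(.SD, function(x) ifelse(is.na(x), x, paste0(x, "%"))), .SDcols = 4:8]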
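To try the approach without the original data, here is a minimal sketch on a made-up table. The table contents, the column name Metric1 and the helper vector num_cols are assumptions for illustration, not part of the question's data. It uses ifelse() so that NA entries stay NA and the result has one value per selected row.

```r
library(data.table)

# Hypothetical toy table mirroring the layout in the question (fewer columns).
df <- data.table(
  MONTH   = c("MTD: 1", "MTD: 1", "Percent Change:", "MTD: 1", "Percent Change:"),
  YEAR    = c(2015, 2016, 2016, 2016, 2016),
  Client  = c("Client A", "Client A", "Client A", "Client B", "Client B"),
  Revenue = c(255999.33, 393450.17, 53.69187, 178793.62, NA),
  Metric1 = c(7.364126, 11.058249, 50.163774, 7.074964, NA)
)

# Assumed helper: the columns that should receive the "%" suffix.
num_cols <- c("Revenue", "Metric1")

# A column cannot mix numbers and strings like "53.69187%",
# so convert the numeric columns to character first.
df[, (num_cols) := lapply(.SD, as.character), .SDcols = num_cols]

# Append "%" in the "Percent Change:" rows only; NA stays NA.
df[MONTH == "Percent Change:",
   (num_cols) := lapply(.SD, function(x) ifelse(is.na(x), x, paste0(x, "%"))),
   .SDcols = num_cols]

print(df)
```

Because the columns become character, this is best done as a final display step; any further arithmetic should happen before the conversion.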