Question

我的数据框架如下所示

> tornado_frame
         tornado_names Level      value
1     node per cluster   low  -34.72222
2          TB per node   low  -52.08333
3  expense per cluster   low -104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high   20.83333
7          TB per node  high   41.66667
8  expense per cluster  high   52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

我希望表格转换为此

> tornado_frame
         tornado_names Level      value
1     node per cluster   low   34.72222
2          TB per node   low   52.08333
3  expense per cluster   low  104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high  -20.83333
7          TB per node  high  -41.66667
8  expense per cluster  high  -52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000

如果“绝对值”大于“高”级别列和相同tornado_name列的负号，则“值”中的负号会发生变化。

我尝试了几个嵌套if，但这对我来说很麻烦。任何帮助将不胜感激！

这是我的数据：

> dput(tornado_frame)
structure(list(tornado_names = structure(c(2L, 4L, 1L, 5L, 3L, 
2L, 4L, 1L, 5L, 3L), .Label = c("expense per cluster", "node per cluster", 
"revenue per cluster", "TB per node", "Total TB"), class = "factor"), 
    Level = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L
    ), .Label = c("high", "low"), class = "factor"), value = c(34.72222, 
    52.08333, 104.16667, -62.5, -52.08333, -20.83333, -41.66667, 
    -52.08333, 145.83333, 156.25)), .Names = c("tornado_names", 
"Level", "value"), class = "data.frame", row.names = c(NA, -10L
))

Answer 1

这是一个可能的data.table解决方案

library(data.table)
setDT(df)[, value := if(diff(abs(value)) < 0) value * -1,
                                            by = tornado_names]
df
#           tornado_names Level     value
#  1:    node per cluster   low  34.72222
#  2:         TB per node   low  52.08333
#  3: expense per cluster   low 104.16667
#  4:            Total TB   low -62.50000
#  5: revenue per cluster   low -52.08333
#  6:    node per cluster  high -20.83333
#  7:         TB per node  high -41.66667
#  8: expense per cluster  high -52.08333
#  9:            Total TB  high 145.83333
# 10: revenue per cluster  high 156.25000

这将检查您的条件是否为tornado_names，并且仅更改满足条件的组内的值的符号。

基于多个其他列有条件地替换数据帧列中的值 - R.

1 个答案: