My data frame looks like this:
> tornado_frame
         tornado_names Level      value
1     node per cluster   low  -34.72222
2          TB per node   low  -52.08333
3  expense per cluster   low -104.16667
4             Total TB   low  -62.50000
5  revenue per cluster   low  -52.08333
6     node per cluster  high   20.83333
7          TB per node  high   41.66667
8  expense per cluster  high   52.08333
9             Total TB  high  145.83333
10 revenue per cluster  high  156.25000
I would like the table to be transformed into this:
> tornado_frame
         tornado_names Level     value
1     node per cluster   low  34.72222
2          TB per node   low  52.08333
3  expense per cluster   low 104.16667
4             Total TB   low -62.50000
5  revenue per cluster   low -52.08333
6     node per cluster  high -20.83333
7          TB per node  high -41.66667
8  expense per cluster  high -52.08333
9             Total TB  high 145.83333
10 revenue per cluster  high 156.25000
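The sign of "value" should be flipped, for both the "low" and the "high" row, whenever the absolute value at the "low" level is greater than the absolute value at the "high" level for the same tornado_names.
I tried a few nested ifs, but that got cumbersome. Any help would be greatly appreciated!
Here is my data: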
> dput(tornado_frame)
structure(list(tornado_names = structure(c(2L, 4L, 1L, 5L, 3L,
2L, 4L, 1L, 5L, 3L), .Label = c("expense per cluster", "node per cluster",
"revenue per cluster", "TB per node", "Total TB"), class = "factor"),
Level = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L
), .Label = c("high", "low"), class = "factor"), value = c(-34.72222,
-52.08333, -104.16667, -62.5, -52.08333, 20.83333, 41.66667,
52.08333, 145.83333, 156.25)), .Names = c("tornado_names",
"Level", "value"), class = "data.frame", row.names = c(NA, -10L
))
Answer 0 (score: 3)
Here is a possible data.table solution:
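library(data.table)
# Within each tornado_names group (rows ordered low, then high), flip the sign
# of both values when |high| < |low|; otherwise keep the values unchanged
setDT(tornado_frame)[, value := if (diff(abs(value)) < 0) -value else value,
                     by = tornado_names]
tornado_frame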
#           tornado_names Level     value
#  1:    node per cluster   low  34.72222
#  2:         TB per node   low  52.08333
#  3: expense per cluster   low 104.16667
#  4:            Total TB   low -62.50000
#  5: revenue per cluster   low -52.08333
#  6:    node per cluster  high -20.83333
#  7:         TB per node  high -41.66667
#  8: expense per cluster  high -52.08333
#  9:            Total TB  high 145.83333
# 10: revenue per cluster  high 156.25000
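This checks your condition separately for each tornado_names group and flips the sign of the values only within the groups that satisfy it. Note that diff(abs(value)) relies on the "low" row coming before the "high" row within each group, as it does in your data.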
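If data.table is not available, the same per-group check can be sketched in base R. This is only an illustrative alternative, not part of the solution above. It assumes every tornado_names value has exactly one "low" and one "high" row, and the helper names (key, abs_low, abs_high, flip) are made up for this example. The input is rebuilt from the first table in the question.

```r
# Input data, copied from the first table in the question
tornado_frame <- data.frame(
  tornado_names = rep(c("node per cluster", "TB per node", "expense per cluster",
                        "Total TB", "revenue per cluster"), times = 2),
  Level = rep(c("low", "high"), each = 5),
  value = c(-34.72222, -52.08333, -104.16667, -62.5, -52.08333,
            20.83333, 41.66667, 52.08333, 145.83333, 156.25)
)

# Assumption: exactly one "low" and one "high" row per tornado_names
key     <- as.character(tornado_frame$tornado_names)
is_low  <- tornado_frame$Level == "low"
abs_val <- abs(tornado_frame$value)

# Lookup vectors: |value| at each level, named by tornado_names
abs_low  <- setNames(abs_val[is_low],  key[is_low])
abs_high <- setNames(abs_val[!is_low], key[!is_low])

# TRUE for every row whose group has |low| > |high|
flip <- unname(abs_low[key] > abs_high[key])

# Flip the sign only in the groups that meet the condition
tornado_frame$value <- ifelse(flip, -tornado_frame$value, tornado_frame$value)
tornado_frame
```

Because the levels are matched by name rather than by position, this version does not depend on the order of the rows.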