My data looks like this:
library(data.table)
set.seed(1)
dt <- data.table(stock = c(rep("a",24),rep("b",24),rep("c",24),rep("d",24)),
hour = rep(1:24,4), day1 = sample(-5:5,96,replace = TRUE),
day2 = sample(-10:-1,96,replace = TRUE), day3 = sample(0:10,96,replace = TRUE),
day4 = 0)
I create a column with the total across the days for each row, and a row with the total across all stocks for each day, as follows:
dt[,Total_by_hour := rowSums(.SD), .SDcols = c("day1","day2","day3","day4")]
totals_row <- data.table(stock = "Total",hour = NA, t(colSums(dt[,!1:2])))
dt <- rbind(dt,totals_row)
It looks like this:
stock hour day1 day2 day3 day4 Total_by_hour
a 1 -3 -6 1 0 -8
a 2 -1 -6 10 0 3
a 3 1 -2 3 0 2
...
d 22 4 -5 1 0 0
d 23 3 -3 3 0 3
d 24 3 -7 1 0 -3
Total NA 18 -507 426 0 -63
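I want to sort the rows in descending order by the "Total_by_hour" column. I also want to set the order of the day1, day2, day3, day4 columns according to the last row, "Total", sorted in descending order. That is, reorder them to day3 (total 426), day1 (total 18), day4 (total 0), day2 (total -507).
I welcome any ideas. Many thanks.
Answer 0 (score: 3)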
You can use the setorder function to reorder the rows of a data.table and the setcolorder function to reorder the columns:
# Order by Total_by_hour descending
setorder(dt, -Total_by_hour)
Output:
> head(dt)
stock hour day1 day2 day3 day4 Total_by_hour
1: a 21 5 -3 8 0 10
2: c 20 3 -3 10 0 10
3: d 4 4 -2 8 0 10
4: a 8 2 -1 8 0 9
5: a 15 3 -1 6 0 8
6: d 5 4 -2 6 0 8
Then reorder the day columns:
# Create a vector of the column names to reorder
cols_to_order <- paste0("day", 1:4)
# Get the descending order of the Total row for just these columns
# (unlist() turns the one-row data.table into a plain numeric vector for order())
reorder <- rev(order(unlist(dt[stock == "Total", cols_to_order, with = FALSE])))
# Set the new column order
setcolorder(dt, neworder = c("stock", "hour", cols_to_order[reorder], "Total_by_hour"))
Output:
> head(dt)
stock hour day3 day1 day4 day2 Total_by_hour
1: a 21 8 5 0 -3 10
2: c 20 10 3 0 -3 10
3: d 4 8 4 0 -2 10
4: a 8 8 2 0 -1 9
5: a 15 6 3 0 -1 8
6: d 5 6 4 0 -2 8
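The two steps above can also be wrapped into one function. The following is only a sketch: `reorder_by_totals` is a hypothetical helper name, not part of data.table, and the small `toy` table is made up for illustration. It assumes the table does not yet contain a "Total" row, so the column totals are computed from the data rows directly.

```r
library(data.table)

# Hypothetical helper (not a data.table function): sorts rows by `total_col`
# in descending order and reorders `day_cols` by their column sums, largest first.
reorder_by_totals <- function(dt, day_cols, total_col = "Total_by_hour") {
  # Column sums are taken from the data, so no "Total" row is required
  sums <- sapply(day_cols, function(col) sum(dt[[col]], na.rm = TRUE))
  ordered_days <- day_cols[order(sums, decreasing = TRUE)]
  other_cols <- setdiff(names(dt), c(day_cols, total_col))
  setorderv(dt, total_col, order = -1L)  # reorders rows by reference
  setcolorder(dt, c(other_cols, ordered_days, total_col))
  invisible(dt)
}

# Small made-up example
toy <- data.table(stock = c("a", "b", "c"), hour = 1:3,
                  day1 = c(1, 2, 3), day2 = c(-5, -4, -3), day3 = c(10, 0, 5))
toy[, Total_by_hour := rowSums(.SD), .SDcols = c("day1", "day2", "day3")]
reorder_by_totals(toy, c("day1", "day2", "day3"))
names(toy)
```

Because setorderv and setcolorder work by reference, `toy` itself is modified. A totals row can be appended afterwards with rbind, as in the question, so that it stays at the bottom instead of being sorted in with the data rows.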
Answer 1 (score: 2)
Another way using data.table:
library(data.table)
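# Note: this sorts ascending; use setorder(dt, -Total_by_hour) for the descending order the question asks for
setorder( dt, Total_by_hour)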
setcolorder( dt, c(grep("day", colnames(dt), value = TRUE, invert = TRUE),
colnames( sort(dt[ nrow(dt), .SD, .SDcols = grep("day", colnames(dt)) ], decreasing = TRUE))))
head(dt)
# stock hour Total_by_hour day3 day1 day4 day2
# 1: Total NA -63 426 18 0 -507
# 2: a 10 -11 2 -5 0 -8
# 3: d 14 -11 1 -3 0 -9
# 4: b 23 -9 4 -5 0 -8
# 5: c 16 -9 1 -2 0 -8
# 6: c 23 -9 3 -2 0 -10
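Answer 2 (score: 1)
Using dplyr: first, arrange the rows by the last column in descending order.
library(dplyr)
dt_1 <- dt %>% arrange(desc(Total_by_hour))
Now, compute the totals and order the columns accordingly: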
dt_cols <- dt %>% select(contains("day")) %>% summarise_all(sum)
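# Use order() rather than rank(): order() returns the column positions sorted by total
day_order <- order(unlist(dt_cols[1, ]), decreasing = TRUE)
columns_ordered <- c("stock", "hour",
c("day1","day2","day3","day4")[day_order],
"Total_by_hour")
# select() works whether dt_1 is a data.frame or a data.table
dt_2 <- dt_1 %>% select(all_of(columns_ordered))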
Finally, add the "Total" row again:
totals_row <- data.table(stock = "Total",hour = NA, t(colSums(dt_2[,3:7])))
dt_2 <- rbind(dt_2,totals_row)
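For comparison, here is a minimal dplyr-only sketch of the same idea that keeps the totals row at the bottom. The tibble `df` and its values are made up for illustration, and it assumes the data has no "Total" row before sorting.

```r
library(dplyr)

# Small made-up example without a "Total" row
df <- tibble(stock = c("a", "b", "c"), hour = 1:3,
             day1 = c(1, 2, 3), day2 = c(-5, -4, -3), day3 = c(10, 0, 5)) %>%
  mutate(Total_by_hour = day1 + day2 + day3)

# Column totals for the day columns, as a named numeric vector
day_totals <- df %>% summarise(across(starts_with("day"), sum))
day_order <- names(sort(unlist(day_totals), decreasing = TRUE))

# Sort rows descending, then put the day columns in order of their totals
sorted <- df %>%
  arrange(desc(Total_by_hour)) %>%
  select(stock, hour, all_of(day_order), Total_by_hour)

# Build the totals row last so it is not sorted in with the data rows;
# bind_rows() fills the missing hour with NA
totals <- sorted %>%
  summarise(across(-c(stock, hour), sum)) %>%
  mutate(stock = "Total")
result <- bind_rows(sorted, totals)
result
```

Appending the totals row after sorting avoids the situation seen in Answer 1, where the "Total" row ends up mixed in with the data rows.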