In the example data frame:
dframe <- structure(list(id = c(14768L, 18180L), col1 = c(-0.6084, -0.3227
), col2 = c(-1.4887, -1.1797), col3 = c(3.8402, 3.0491), col4 = c(-1.8265,
-1.3248), col5 = c(0.4078, 0.7862), col1_new = c(-0.4582, -0.2094
), col2_new = c(-1.3878, -1.5926), col3_new = c(3.3112, 3.2756
), col4_new = c(-1.6242, -1.2361), col5_new = c(0.5014, 0.5925
)), class = "data.frame", row.names = c(NA, -2L))
How can I detect, for each unique id, whether each pair of columns (e.g. col1 vs. col1_new) shows an increase or a decrease?
Answer 0 (score: 1)
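First, it helps to convert the data to a more useful long format with data.table::melt. Then we can simply compute the difference for each pair and add a new column indicating whether it is an increase or a decrease.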
library(data.table)
library(tidyverse)
setDT(dframe)
# take each pair of original and new values, and move each pair from columns
# to its own row (with two value columns, "current" and "new")
dframe_new <- melt(dframe, id.vars = "id",
                   measure.vars = patterns("\\d$", "new$"),
                   value.name = c("current", "new")) %>%
  mutate(
    # positive diff means the new value is larger than the current one
    diff = new - current,
    Change = case_when(
      diff > 0 ~ "Increase",
      diff == 0 ~ "No Change",
      TRUE ~ "Decrease"
    )
  )
      id variable current     new    diff   Change
1  14768        1 -0.6084 -0.4582  0.1502 Increase
2  18180        1 -0.3227 -0.2094  0.1133 Increase
3  14768        2 -1.4887 -1.3878  0.1009 Increase
4  18180        2 -1.1797 -1.5926 -0.4129 Decrease
5  14768        3  3.8402  3.3112 -0.5290 Decrease
6  18180        3  3.0491  3.2756  0.2265 Increase
7  14768        4 -1.8265 -1.6242  0.2023 Increase
8  18180        4 -1.3248 -1.2361  0.0887 Increase
9  14768        5  0.4078  0.5014  0.0936 Increase
10 18180        5  0.7862  0.5925 -0.1937 Decrease
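One caveat with the diff == 0 test above: with floating-point values, two numbers that should be equal can differ by a tiny rounding error, so an exact comparison may label them "Decrease" or "Increase". A minimal sketch of a tolerance-based classifier follows. The helper name classify_change and the default tol are assumptions made for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): classify a numeric
# difference, treating anything within `tol` of zero as "No Change".
classify_change <- function(diff, tol = 1e-8) {
  ifelse(abs(diff) <= tol, "No Change",
         ifelse(diff > 0, "Increase", "Decrease"))
}

# The third value is mathematically zero, but floating-point rounding
# may leave a tiny nonzero remainder; the tolerance absorbs it.
classify_change(c(0.1502, -0.4129, 0.3 - 0.1 - 0.2))
```

Inside the mutate() call it could be used as Change = classify_change(diff), assuming a tolerance that suits the scale of the data.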
A simple plot might be:
ggplot(dframe_new, aes(factor(id), diff, color = Change)) + geom_point()
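If a wide result with one row per id is preferred, the same comparison can be sketched in base R without reshaping. This is only an illustration under the assumption that every colN has a matching colN_new. The names old_cols, new_cols, diffs, change and result are made up for this example, and the data is rebuilt as a plain data.frame because the setDT() call above changes how dframe[...] indexing behaves.

```r
# Rebuild the example data as a plain data.frame (values copied from the question)
dframe <- data.frame(
  id = c(14768L, 18180L),
  col1 = c(-0.6084, -0.3227), col2 = c(-1.4887, -1.1797),
  col3 = c(3.8402, 3.0491),   col4 = c(-1.8265, -1.3248),
  col5 = c(0.4078, 0.7862),
  col1_new = c(-0.4582, -0.2094), col2_new = c(-1.3878, -1.5926),
  col3_new = c(3.3112, 3.2756),   col4_new = c(-1.6242, -1.2361),
  col5_new = c(0.5014, 0.5925)
)

# Pair each original column with its "_new" counterpart by name
old_cols <- paste0("col", 1:5)
new_cols <- paste0(old_cols, "_new")

# Element-wise difference between the two blocks of columns (new - current)
diffs <- dframe[new_cols] - dframe[old_cols]

# sign() returns -1, 0 or 1; adding 2 turns that into an index into the labels
labels <- c("Decrease", "No Change", "Increase")
change <- as.data.frame(lapply(diffs, function(d) labels[sign(d) + 2]))
names(change) <- old_cols

# One row per id, one column per compared pair
result <- cbind(id = dframe$id, change)
result
```

Each cell of result then holds the direction of change for that id and column pair, which should agree with the Change column in the long-format output.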