I have a data frame with three variables: one ("group") is a factor with two levels, one ("word") is a character vector, and one ("duration") is numeric. For example:
DATA <- data.frame(
  group = c(rep("prefinal", 10), rep("final", 10)),
  word = c(sample(LETTERS[1:5], 10, replace = TRUE), sample(LETTERS[1:5], 10, replace = TRUE)),
  duration = rnorm(20)
)
DATA
group word duration
1 prefinal C 0.16378771
2 prefinal E 0.13370196
3 prefinal A 0.69112398
4 prefinal B 0.21499187
5 prefinal D -0.28998279
6 prefinal D -2.00353522
7 prefinal A 0.37842555
8 prefinal E 1.62326170
9 prefinal A -0.26294929
10 prefinal B -0.54276322
11 final D 1.32772171
12 final E -1.84902285
13 final C 0.01058158
14 final E 1.49529743
15 final B 0.55291290
16 final A -0.35484820
17 final D -0.16822110
18 final A 0.88667458
19 final E 0.70889916
20 final B 1.12217332
I want to show the durations of the words, split by group, in a boxplot:
boxplot(DATA$duration ~ DATA$group + DATA$word,
xaxt="n",
col = rep(c("blue", "red"), 5))
axis(1, at = seq(from=1.5, to= 10.5, by=2), labels = sort(unique(DATA$word)), cex.axis = 0.9)
By default, R seems to order the boxes alphabetically (that is, in the order of the "word" variable).
Answer 0 (score: 1)
You can reorder the levels of DATA$word by the median of DATA$duration for each word. The - in front of DATA$duration sorts them in descending order.
DATA$word <- reorder(DATA$word, -DATA$duration, FUN = median)
boxplot(DATA$duration ~ DATA$group + DATA$word,
xaxt="n",
col = rep(c("blue", "red"), 5))
axis(1, at = seq(from=1.5, to= 10.5, by=2), labels = levels(DATA$word), cex.axis = 0.9)
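To check that the reordering did what was intended, the per-word medians can be printed in level order. This is a minimal sketch: the set.seed() call and the name word_medians are assumptions added here for repeatability and illustration, not part of the original code.

```r
# Assumption: a fixed seed, so the sketch gives the same data on every run.
set.seed(1)
DATA <- data.frame(
  group = c(rep("prefinal", 10), rep("final", 10)),
  word = c(sample(LETTERS[1:5], 10, replace = TRUE),
           sample(LETTERS[1:5], 10, replace = TRUE)),
  duration = rnorm(20)
)

# Reorder the word levels by descending median duration.
DATA$word <- reorder(DATA$word, -DATA$duration, FUN = median)

# Median duration per word, listed in the new level order
# (word_medians is a hypothetical helper name).
word_medians <- tapply(DATA$duration, DATA$word, median)
word_medians
levels(DATA$word)
```

The medians should appear from largest to smallest, and their names should match levels(DATA$word), which is the order boxplot() uses for the boxes.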
You can do the same using only the prefinal subgroup, but that takes an extra step:
ordered_levels <- levels(with(DATA[DATA$group == "prefinal",], reorder(word, -duration, FUN = median)))
DATA$word <- factor(DATA$word, levels = ordered_levels)
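For completeness, here is a sketch of the prefinal-based ordering followed by the plot. It is a variant of the two lines above rather than the same code: it computes the medians with tapply() and appends any word that never occurs in the prefinal group, since such a word would otherwise become NA when word is a plain character vector. The set.seed() call and the names prefinal and prefinal_medians are assumptions for illustration.

```r
# Assumption: a fixed seed, so the sketch gives the same data on every run.
set.seed(1)
DATA <- data.frame(
  group = c(rep("prefinal", 10), rep("final", 10)),
  word = c(sample(LETTERS[1:5], 10, replace = TRUE),
           sample(LETTERS[1:5], 10, replace = TRUE)),
  duration = rnorm(20)
)

# Median duration per word within the "prefinal" rows only.
prefinal <- DATA[DATA$group == "prefinal", ]
prefinal_medians <- tapply(prefinal$duration, as.character(prefinal$word), median)

# Levels by descending prefinal median; words absent from "prefinal" go last.
ordered_levels <- names(sort(prefinal_medians, decreasing = TRUE))
ordered_levels <- c(ordered_levels,
                    setdiff(unique(as.character(DATA$word)), ordered_levels))
DATA$word <- factor(DATA$word, levels = ordered_levels)

# Two boxes per word; group levels sort as "final", "prefinal",
# so blue is "final" and red is "prefinal".
n_words <- nlevels(DATA$word)
boxplot(duration ~ group + word, data = DATA,
        xaxt = "n", col = rep(c("blue", "red"), n_words))
axis(1, at = seq(from = 1.5, by = 2, length.out = n_words),
     labels = levels(DATA$word), cex.axis = 0.9)
```

Deriving the tick positions and colours from nlevels(DATA$word) instead of hard-coding 5 keeps the axis labels aligned if the sample happens to contain fewer than five distinct words.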