I want to use R to draw a series of box plots ordered by median. Suppose I run:
boxplot(cost ~ type)
This gives me box plots with cost on the y-axis and the type categories along the x-axis:
-----   -----
  |       |
 [ ]      |
  |      [ ]
  |       |
-----   -----
  A       B
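However, what I want is the box plots ordered from highest to lowest median. I suspect I need to relabel the types (A or B) so that the labels numerically indicate which has the lowest and which the highest median, but I wonder whether there is a smarter way to solve this.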
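Answer 0: (score: 48)
Check out ?reorder. Its example seems to be what you want, but sorted in the opposite order. I changed the first line below to use -count so that it sorts in the order you want.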
bymedian <- with(InsectSprays, reorder(spray, -count, median))
boxplot(count ~ bymedian, data = InsectSprays,
xlab = "Type of spray", ylab = "Insect count",
main = "InsectSprays data", varwidth = TRUE,
col = "lightgray")
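To confirm the ordering without reading it off the plot, the per-group medians can be printed in the new level order. This is only a minimal sketch on the same built-in InsectSprays data; the variable name `med` is my own, chosen for illustration.

```r
# Reorder the spray levels by descending median count (same call as above)
bymedian <- with(InsectSprays, reorder(spray, -count, median))

# Median count per spray, listed in the new level order
med <- tapply(InsectSprays$count, bymedian, median)
print(med)

# TRUE if the medians never increase from the first level to the last
all(diff(as.numeric(med)) <= 0)
```

Because reorder() only changes the order of the factor levels, the data themselves are untouched; boxplot() simply draws the groups in level order.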
Answer 1: (score: 12)
Yes, that is the idea:
> set.seed(42) # fix seed
> # factor() is needed in R >= 4.0, where strings are no longer converted to factors
> DF <- data.frame(type=factor(sample(LETTERS[1:5], 100, replace=TRUE)),
+ cost=rnorm(100))
>
> boxplot(cost ~ type, data=DF) # not ordered by median
>
> # compute index of ordered 'cost factor' and reassign
> oind <- order(as.numeric(by(DF$cost, DF$type, median)))
> DF$type <- ordered(DF$type, levels=levels(DF$type)[oind])
>
> boxplot(cost ~ type, data=DF) # now it is ordered by median
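Note that the snippet above sorts from lowest to highest median. To get highest to lowest, as the question asks, one possible variant uses tapply() and order(..., decreasing = TRUE). This is a sketch under the same simulated data; `med` is an illustrative name, and type is wrapped in factor() explicitly so that it does not depend on the R version's stringsAsFactors default.

```r
set.seed(42)  # fix seed so the example is reproducible
DF <- data.frame(type = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
                 cost = rnorm(100))

# Median cost per type, named by type
med <- tapply(DF$cost, DF$type, median)

# Rebuild the factor with levels sorted from highest to lowest median
DF$type <- factor(DF$type, levels = names(med)[order(med, decreasing = TRUE)])

boxplot(cost ~ type, data = DF)  # boxes now run from highest to lowest median
```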
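Answer 2: (score: 0)
Watch out for missing values: you have to add na.rm = TRUE for this to work. Without it, the code does not work at all. It took me hours to find that out.
bymedian <- with(InsectSprays, reorder(spray, -count, median, na.rm = TRUE))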
boxplot(count ~ bymedian, data = InsectSprays,
xlab = "Type of spray", ylab = "Insect count",
main = "InsectSprays data", varwidth = TRUE,
col = "lightgray")
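The effect is easy to reproduce. InsectSprays itself has no missing values, so the sketch below works on a copy (`d`, a name I made up) with one count deliberately set to NA; extra arguments such as na.rm are passed through to median() by both tapply() and reorder().

```r
# Copy of the built-in data with one count for spray A set to NA (illustration only)
d <- InsectSprays
d$count[which(d$spray == "A")[1]] <- NA

# Without na.rm, the median for spray A comes back as NA
med_raw <- tapply(d$count, d$spray, median)
print(med_raw)

# With na.rm = TRUE, every spray gets a numeric median
med_ok <- tapply(d$count, d$spray, median, na.rm = TRUE)
print(med_ok)

# The same argument makes reorder() sort all groups by a real median
bymedian <- with(d, reorder(spray, -count, median, na.rm = TRUE))
boxplot(count ~ bymedian, data = d)
```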