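The hist function, which draws histograms, has a parameter include.lowest whose default value is TRUE.
As I understand it, when breaks is given as a vector, this parameter should control whether the value equal to the lowest break is kept in the first bin.
However, if I try, as a purely artificial example, a command like the following: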
hist(c(1:100), breaks=c(1,2,10,50,100), include.lowest=FALSE)
I just get an error:
Error in hist.default(c(1:100), breaks = c(1, 2, 10, 50, 100), include.lowest = FALSE) :
some 'x' not counted; maybe 'breaks' do not span range of 'x'
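What happens here is that hist refuses to draw a plot that does not account for all of the data (x). With include.lowest set to FALSE, the value 1 from my data would not appear anywhere in the histogram. But if that is the case, what is include.lowest for? I cannot see any situation in which setting it to FALSE makes a difference without triggering the error.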
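Note: in my explanation I assume that I keep the default right=TRUE, but with right=FALSE I should get the same behavior for the highest break instead of the lowest, right? So I don't think it changes anything.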
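The symmetry described in the note can be checked without drawing anything. The following is a minimal sketch, assuming the same data and breaks as in the example above; try_hist is a hypothetical helper introduced only for illustration, and plot = FALSE is used so that hist only computes the bins.

```r
x <- 1:100
br <- c(1, 2, 10, 50, 100)

# Hypothetical helper: returns the bin counts, or the error message if hist() fails.
try_hist <- function(...) {
  tryCatch(
    hist(x, breaks = br, plot = FALSE, ...)$counts,
    error = function(e) conditionMessage(e)
  )
}

# right = TRUE: bins are (a, b], so x = 1 (the lowest break) is left out.
res_right <- try_hist(right = TRUE, include.lowest = FALSE)

# right = FALSE: bins are [a, b), so x = 100 (the highest break) is left out.
res_left <- try_hist(right = FALSE, include.lowest = FALSE)

# Defaults (include.lowest = TRUE): every value is counted.
res_default <- try_hist()

print(res_right)
print(res_left)
print(res_default)
```

In this sketch both calls with include.lowest = FALSE should end in the "some 'x' not counted" error, while the default call returns counts.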
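More context: we are developing a graphical interface for drawing plots with R (it will be part of R++, and of course it will be awesome). We got stuck while building controls for all of the histogram parameters. If this one is useless and is just a leftover from some old version of hist, we might as well leave it out, but if it is actually useful, we don't want to forget it.
Thanks for your attention.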
Answer 0 (score: 0)
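I'm not sure what you are asking. I assume you are asking about the behavior of include.lowest = FALSE in hist, and why it produces an error in your example.
This has to do with how the data are binned. Let's look at cut, since that function is closely related to what hist does.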
cut(1:100, breaks = c(1, 2, 10, 50, 100))
# [1] <NA> (1,2] (2,10] (2,10] (2,10] (2,10] (2,10] (2,10]
# [9] (2,10] (2,10] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50]
# [17] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50]
# [25] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50]
# [33] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50]
# [41] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50] (10,50]
# [49] (10,50] (10,50] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100]
# [57] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100]
# [65] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100]
# [73] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100]
# [81] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100]
# [89] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100] (50,100]
# [97] (50,100] (50,100] (50,100] (50,100]
#Levels: (1,2] (2,10] (10,50] (50,100]
Note how 1 is "binned" as NA. That is because the bins are left-open, right-closed intervals; for example, (1, 2] means that 1 is excluded and 2 is included.
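For comparison, cut has its own include.lowest argument, which defaults to FALSE. A short sketch, reusing the breaks from above (the variable names are only illustrative), shows what changes when it is switched on:

```r
br <- c(1, 2, 10, 50, 100)

# Default behaviour of cut(): include.lowest = FALSE, so 1 falls outside (1,2].
bins_open <- cut(1:100, breaks = br)

# With include.lowest = TRUE the first interval is closed on both sides.
bins_closed <- cut(1:100, breaks = br, include.lowest = TRUE)

print(sum(is.na(bins_open)))    # number of values left without a bin
print(sum(is.na(bins_closed)))
print(levels(bins_closed))
```

With include.lowest = TRUE the first level is labelled [1,2] instead of (1,2], and the value 1 is no longer NA.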
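Returning to hist, the following works with include.lowest = FALSE, because the lowest break (0) now lies below the smallest value in the data: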
hist(1:100, breaks = c(0, 2, 10, 50, 100), include.lowest = FALSE)
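To see the resulting bins without drawing the plot, the counts can be inspected with plot = FALSE. This is only a sketch using the same data; the object names are illustrative.

```r
x <- 1:100

# Lowest break below min(x): nothing sits on the excluded boundary,
# so include.lowest = FALSE is accepted.
h_open <- hist(x, breaks = c(0, 2, 10, 50, 100),
               include.lowest = FALSE, plot = FALSE)

# Lowest break equal to min(x): the default include.lowest = TRUE
# puts the value 1 into the first bin.
h_closed <- hist(x, breaks = c(1, 2, 10, 50, 100), plot = FALSE)

print(h_open$counts)
print(h_closed$counts)
```

For this data both calls give the same counts, so include.lowest only matters when some value coincides with the lowest break (or with the highest break when right = FALSE).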
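Clarification (based on @MikkoMarttila's comment): binning with include.lowest = FALSE in hist matches the default behavior of standard binning in R, e.g. cut. So the option to set include.lowest = FALSE is there for consistency with cut and its default left-open, right-closed intervals. Most of the time, when plotting a histogram, you want the smallest value to be part of the first interval (which is not the case with left-open intervals), hence the default include.lowest = TRUE.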