Question

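I want to use R's hist function to get bin counts. Because I don't know the lowest or highest value in my data, I use -Inf and Inf as the first and last breaks. But instead of counting from -Inf to the first break and from the last break to Inf, R puts all values into the first bin.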

> hist(1:100, breaks=c(0, 50, 100), plot=F)$counts
[1] 50 50
> hist(1:100, breaks=c(-Inf, 50, 100), plot=F)$counts
[1] 100   0
> hist(1:100, breaks=c(0, 50, Inf), plot=F)$counts
[1] 100   0
> hist(1:100, breaks=c(-Inf, 50, Inf), plot=F)$counts
[1] 100   0

I expected all four lines to give the same output, but they don't. Is this the expected behavior? Is there an easy workaround for this problem?

Edit: I ended up using table and cut instead:

table(cut(1:100, breaks=c(-Inf, 50, Inf)))
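A minimal sketch of the workaround above, using the data and breaks from the question. The variable names (x, breaks, bin_counts, counts) are chosen here for illustration only. It shows how the table can be turned into a plain vector comparable to hist(...)$counts:

```r
# Data and open-ended breaks from the question.
x <- 1:100
breaks <- c(-Inf, 50, Inf)

# cut() assigns each value to an interval (right-closed by default);
# table() then counts how many values fall into each interval.
bin_counts <- table(cut(x, breaks = breaks))

# Drop the interval labels to get a plain vector of counts,
# comparable to hist(...)$counts.
counts <- as.vector(bin_counts)
print(counts)
```

Because cut only labels values by interval and never needs a bin width, infinite outer breaks are not a problem here.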

Answer 1

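It causes problems for hist because the width of the outer bins becomes infinite, and by default hist takes the area of the bins into account in its calculation. As ?hist puts it: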

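"The default with non-equi-spaced breaks is to give a plot of area one, in which the area of the rectangles is the fraction of the data points falling in the cells."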
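To make the role of bin width concrete, here is a small sketch with finite, unequal breaks (the break values 0, 50, 75, 100 are an assumption made for illustration, not taken from the question). It checks that the reported density equals count / (n * width), which is why a bin of infinite width cannot be given a meaningful rectangle:

```r
x <- 1:100
breaks <- c(0, 50, 75, 100)  # illustrative, non-equidistant but finite

h <- hist(x, breaks = breaks, plot = FALSE)

# Width of each bin and the density recomputed by hand.
widths <- diff(h$breaks)
manual_density <- h$counts / (length(x) * widths)

print(h$counts)
print(all.equal(h$density, manual_density))

# The rectangles (density * width) add up to an area of one.
total_area <- sum(h$density * widths)
print(total_area)
```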

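It is better to use the single-number form of the breaks argument: the number of cells to use. hist treats this number as a suggestion only and, by default, picks sensible ("pretty") breakpoints for your data: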

str(hist(1:100, breaks=3, plot=F))
List of 6
 $ breaks  : num [1:3] 0 50 100
 $ counts  : int [1:2] 50 50
 $ density : num [1:2] 0.01 0.01
 $ mids    : num [1:2] 25 75
 $ xname   : chr "1:100"
 $ equidist: logi TRUE
 - attr(*, "class")= chr "histogram"
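If explicit inner breaks are still wanted with hist, one possible approach is to swap the infinite end points for finite values just outside the data before calling it. The helper finite_breaks below is hypothetical (it is not part of R), and the sketch assumes x is numeric with no missing or infinite values:

```r
# Hypothetical helper: replace -Inf / Inf in 'breaks' with finite values
# one unit beyond the range covered by the data and the finite breaks.
finite_breaks <- function(x, breaks) {
  finite <- breaks[is.finite(breaks)]
  lo <- min(c(x, finite)) - 1
  hi <- max(c(x, finite)) + 1
  breaks[breaks == -Inf] <- lo
  breaks[breaks == Inf] <- hi
  breaks
}

x <- 1:100
b <- finite_breaks(x, c(-Inf, 50, Inf))
print(b)

# With finite breaks, hist counts each bin separately.
counts <- hist(x, breaks = b, plot = FALSE)$counts
print(counts)
```

The one-unit margin is an arbitrary choice that keeps the breaks strictly increasing even when the smallest or largest data value coincides with an inner break.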

Behavior of Inf in hist breaks, R
