Question

I reported this to R-core, but they said (without explanation) that it is not a bug in R:

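While automatically processing some data, I ran into an empty data set (or something close to one). In any case, the hist() call I was using threw an error that looks like a syntax error to me (I am an R beginner):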

> df <- data.frame(n=c(0))
> str(df)
'data.frame':    1 obs. of  1 variable:
$ n: num 0
> hist(df$n) ### this one works!
> hist(df$n, nclass=nclass.scott)  ### this does not!
Error in if (h > 0) ceiling(diff(range(x))/h) else 1L :
 missing value where TRUE/FALSE needed
> df <- data.frame(n=c(0,1))
> hist(df$n, nclass=nclass.scott) ### this one works

Versions tested: 3.3.1 (Linux) and 3.3.3 (Windows)

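Without nclass=nclass.scott I do not get the error. However, I could not find documentation for this parameter; I only noticed that histograms drawn with it look more appealing to me. Via Google I found: "nclass.scott uses Scott's choice for a normal distribution based on the estimate of the standard error, unless that is zero where it returns 1."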

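I also expect some robustness: in automatic processing you never know how much data a particular set will contain, and in this case I would prefer a histogram with a single bar. Compare these as well: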

> hist(numeric(0))
Error in hist.default(numeric(0)) : invalid number of 'breaks'
> hist(numeric(1))
> hist(numeric(1), nclass=nclass.scott)
Error in if (h > 0) ceiling(diff(range(x))/h) else 1L : missing value where TRUE/FALSE needed
> hist(numeric(0), nclass=nclass.scott)
Error in if (h > 0) ceiling(diff(range(x))/h) else 1L : missing value where TRUE/FALSE needed

Answer 1

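The standard deviation (the "standard error" in the quoted documentation) cannot be estimated from a single observation, so sd() returns NA in that case, which explains the error message about a missing value.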

> sd(0)
[1] NA

> sd(c(1,1))
[1] 0
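
To make the chain from NA to the error concrete, here is a minimal sketch that replays the computation by hand. The formula for h is an assumption: it is reconstructed from the `if (h > 0) ...` line in the error message and from my reading of the R 3.3 sources of nclass.scott, so treat it as illustrative rather than authoritative.

```r
# One observation, as in the question.
x <- 0

# Assumed bin-width formula of nclass.scott (reconstructed, see note above).
# var() of a single value is NA, so h is NA as well.
h <- 3.5 * sqrt(stats::var(x)) * length(x)^(-1/3)
print(h)  # NA

# `if` cannot branch on NA; this is the condition that hist() trips over.
outcome <- tryCatch(
  if (h > 0) ceiling(diff(range(x)) / h) else 1L,
  error = function(e) e
)
print(conditionMessage(outcome))
```

With c(1, 1) the variance is 0 rather than NA, so the `else 1L` branch is taken and a single class is returned, which matches the documented "unless that is zero" behaviour.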

Answer 2

The best solution (as things stand now) seems to be the following, combining Roland's suggestion with what I had:

if (length(df$n) > 0L) {
    # a single observation gets one bar; otherwise use Scott's rule
    hist(df$n, breaks=if (length(df$n) == 1L) 1L else nclass.scott)
} # else produce nothing
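
For automated processing, the same guard can be wrapped into reusable functions. This is only a sketch: `scott_or_one` and `hist_scott` are hypothetical names introduced here, not part of R, and the wrapper relies on hist() accepting a function for `breaks`.

```r
# Hypothetical helper: Scott's rule, falling back to one class when fewer
# than two values are available (where nclass.scott would hit NA).
scott_or_one <- function(x) {
  if (length(x) < 2L) 1L else grDevices::nclass.scott(x)
}

# Hypothetical wrapper: draw nothing for empty input, otherwise call hist()
# with the guarded rule. Extra arguments are passed through to hist().
hist_scott <- function(x, ...) {
  x <- x[is.finite(x)]            # hist() ignores non-finite values anyway
  if (length(x) == 0L) return(invisible(NULL))
  hist(x, breaks = scott_or_one, ...)
}

# Usage: none of these should raise the "missing value" error.
hist_scott(numeric(0), plot = FALSE)  # empty input: returns NULL
hist_scott(0, plot = FALSE)           # one observation: single class
hist_scott(c(0, 1), plot = FALSE)     # Scott's rule applies
```

Filtering before the length check means a vector consisting only of NA values is treated like an empty one, which seemed the safer choice for unattended runs.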

Why does `hist(..., nclass = nclass.scott)` fail in R?
