Question

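I am running an R script on a UNIX-based system. The script multiplies large numbers, and the results come out as NAs produced by integer overflow. When I run the same script on Windows, the problem does not occur.

However, I need to leave the script running overnight on my desktop, which runs Unix.

Is there any solution to this problem?

Thanks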

for (ol in seq(1, nrow(yi), by=25)) {
    for (oh in seq(1, nrow(yi), by=25)) {
        A = (N*(ol^2)) + ((N*(N+1)*(2*N+1))/6) - (2*ol*((N*N+1)/2)) + (2*N*ol*(N-oh+1)) + ((N-oh+1)*N^2) + (2*N*(oh-N-1)*(oh+N))
    }
}

with:
N = 16569 = nrow(yi)

But even the first iteration is not computed on Unix.

Answer 1

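As the other answers have pointed out, there is something unreproducible/odd about your results so far. However, if you really must do exact calculations on large integers, you will probably need an interface between R and some other system.

Some of your choices are:

the gmp package (see this page and scroll down to R)
an interface to the bc calculator on googlecode
there is a high precision arithmetic page on the R wiki that compares interfaces to Yacas, bc, and MPFR/GMP
there is a limited interface to the PARI/GP package in the elliptical package, but this is probably (much) less useful than the preceding three choices

Most Unix or Cygwin systems should already have bc installed. GMP and Yacas are easy to install on a modern Linux system...

Here is an extended example, with a function that can choose among numeric, integer, or bigz computation.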

library(gmp)  ## provides as.bigz(), used in the bigz branch below
f1 <- function(ol=1L,oh=1L,N=16569L,type=c("num","int","bigz")) {
  type <- match.arg(type)
  ## convert all values to appropriate type
  if (type=="int") {
    ol <- as.integer(ol)
    oh <- as.integer(oh)
    N <- as.integer(N)
    one <- 1L
    two <- 2L
    six <- 6L
    cc <- as.integer
  } else if (type=="bigz") {
    one <- as.bigz(1)
    two <- as.bigz(2)
    six <- as.bigz(6)
    N <- as.bigz(N)
    ol <- as.bigz(ol)
    oh <- as.bigz(oh)
    cc <- as.bigz
  } else {
    one <- 1
    two <- 2
    six <- 6
    N <- as.numeric(N)
    oh <- as.numeric(oh)
    ol <- as.numeric(ol)
    cc <- as.numeric
  }
  ## if using bigz mode, the ratio needs to be converted back to bigz;
  ## defining cc() as above seemed to be the most transparent way to do it
  N*ol^two + cc(N*(N+one)*(two*N+one)/six) -
    ol*(N*N+one) + two*N*ol*(N-oh+one) +
      (N-oh+one)*N^two + two*N*(oh-N-one)*(oh+N)
}

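I removed a lot of unnecessary parentheses, which actually made it harder to see what was going on. It is true that for the (1,1) case the final result is not bigger than .Machine$integer.max (it is negative, although its magnitude of about 3e12 is itself outside the integer range), but some of the intermediate steps are ... (for the (1,1) case this actually reduces to $$-\frac{1}{6}(N+2)(4N^2-5N+3)$$ ...)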

f1()  ##  -3.032615e+12
f1() > .Machine$integer.max  ## FALSE
N <- 16569L
N*(N+1)*(2*N+1) > .Machine$integer.max   ## TRUE
N*(N+1L)*(2L*N+1L)  ## integer overflow (NA)
f1(type="int")      ## integer overflow
f1(type="bigz")     ##  "-3032615078557"
print(f1(),digits=20)  ##  -3032615078557: no actual loss of precision in this case
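The closed form quoted above for the (1,1) case can be checked against the term-by-term expression. The following is only a sketch in plain double arithmetic; the helper names closed_form and term_by_term are made up for this illustration and are not part of the original answer.

```r
## Closed form for ol = oh = 1, evaluated in double precision.
closed_form <- function(N) {
  N <- as.numeric(N)   # force doubles so no integer overflow can occur
  -(N + 2) * (4 * N^2 - 5 * N + 3) / 6
}

## Same expression as in f1(), written out term by term.
term_by_term <- function(N, ol = 1, oh = 1) {
  N <- as.numeric(N)
  N * ol^2 + N * (N + 1) * (2 * N + 1) / 6 -
    ol * (N * N + 1) + 2 * N * ol * (N - oh + 1) +
    (N - oh + 1) * N^2 + 2 * N * (oh - N - 1) * (oh + N)
}

N <- 16569
print(closed_form(N), digits = 20)
print(term_by_term(N), digits = 20)
closed_form(N) == term_by_term(N)   # every intermediate value is an integer below 2^53
```

Both functions stay exact here because all intermediate values are whole numbers well below 2^53, the limit for exactly representable integers in a double.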

PS: there is an (N*N+1) term in your equation. Should that be N*(N+1), or did you really mean N^2+1?

Answer 2

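Given your comments, I guess you seriously misunderstand the "correctness" of numbers in R. You say the result you get on Windows is something like -30598395869593930593. Now, on both 32-bit and 64-bit, that precision is not even achievable with a double, let alone with an integer: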

> x <- -30598395869593930593
> format(x,scientific=F)
[1] "-30598395869593931776"
> all.equal(x,as.numeric(format(x,scientific=F)))
[1] TRUE
> as.integer(x)
[1] NA

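You have 16 digits you can trust; the rest is bollocks. Then again, 16 digits of accuracy is already pretty strong. Most measurement tools do not even come close to that.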
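As a rough illustration of where those 16 digits come from, the sketch below looks at the 53-bit significand of a double. It uses only base R; the offset of 1000 is an arbitrary choice for the demonstration.

```r
.Machine$double.digits   # number of binary digits in the significand of a double

## Above 2^53, consecutive integers can no longer be told apart.
2^53 == 2^53 + 1

## At the magnitude quoted in the question, neighbouring doubles are
## thousands apart, so adding 1000 does not change the stored value.
x <- -30598395869593930593
x == x + 1000
```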

Answer 3

Could you cast your integers to floating-point numbers, so that the calculation is done with floating-point math?

For example:

> x=as.integer(1000000)
> x*x
[1] NA
Warning message:
In x * x : NAs produced by integer overflow
> x=as.numeric(1000000)
> x*x
[1] 1e+12
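Applied to the loop from the question, the same idea might look like the sketch below. It assumes N stands in for nrow(yi), which is not available here, and wraps the formula in a helper called A_value, a name invented for this example.

```r
N <- as.numeric(16569)   # stand-in for nrow(yi); as.numeric() makes it a double

## The expression from the question, unchanged apart from spacing.
A_value <- function(ol, oh, N) {
  (N * ol^2) + ((N * (N + 1) * (2 * N + 1)) / 6) - (2 * ol * ((N * N + 1) / 2)) +
    (2 * N * ol * (N - oh + 1)) + ((N - oh + 1) * N^2) +
    (2 * N * (oh - N - 1) * (oh + N))
}

for (ol in seq(1, N, by = 25)) {
  for (oh in seq(1, N, by = 25)) {
    A <- A_value(ol, oh, N)   # overwritten on each pass, as in the question
  }
}

print(A_value(1, 1, N), digits = 20)   # first iteration, computed without NA
```

Because every operand is a double, no step can trigger integer overflow; the trade-off is that results are only exact while they stay below 2^53.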

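By the way, it is not entirely clear why the warning would appear in one environment and not the other. I first thought the 32-bit and 64-bit builds of R might use 32-bit and 64-bit integers respectively, but that doesn't appear to be the case. Are both of your environments configured identically in terms of how warnings are displayed?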
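One way to compare the two environments is to run the same few checks on each machine. This is only a sketch using base R; it does not explain the discrepancy by itself, but it shows the integer range, the build width, and the warning setting side by side.

```r
.Machine$integer.max      # largest value an R integer can hold
.Machine$sizeof.pointer   # 4 on a 32-bit build, 8 on a 64-bit build
getOption("warn")         # negative: ignored, 0: deferred, 1: immediate, 2: errors

## Force an overflow and capture the warning explicitly, so the outcome
## does not depend on how warnings are displayed.
x <- .Machine$integer.max
res <- withCallingHandlers(
  x + 1L,
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
is.na(res)
```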

NAs produced by integer overflow in R on Linux
