Numbers formatted in R to look nicer (thousands separator etc.) are no longer numbers?

Time: 2014-12-25 16:06:14

Tags: r formatting numbers rstudio

Let me start by saying that I'm new to R and still getting to know it.

Suppose I have a numeric vector:

> eve
 [1] 208999990 208999990 208999995 208999997 208999998 209499990 209499999 209999986 209999997 210000000 210000000 210000000 210000000 210000000 210000000
[16] 217000000 217998986 217998988 218000000 218500000 218999994 218999997 218999998 223999900 223999900 223999945 223999945 223999945 223999999 224199999
[31] 224199999 224199999 224199999 224200000 224799999 224999977 225100004 226998997 226998998 226998998 226999999 227000000 227000000 227000000 227000000
[46] 227000000 227399967 227700100 227798981 228199990 229199988 230000000 230278899 234388500 234388582 235000000 235999999 236388592 236388593 236388599
[61] 236388599 236388599 236388599 236388599 236388600 236388655 236989874 238388583 244000000 246992877 247992884 247997972 247997979 250000000 250000000
[76] 250000000 255000000 261000000 265000000 280000000 285000000

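I'm using formatC to make it more readable/prettier and to actually see the digits after the decimal point (default R behavior seems to hide the decimal places of large numbers?):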

> eve<-formatC(eve, decimal.mark=",", big.mark=" ", digits = 2, format = "f")
> eve
 [1] "208 999 989,99" "208 999 990,00" "208 999 994,99" "208 999 997,00" "208 999 998,00" "209 499 989,99" "209 499 999,00" "209 999 985,99" "209 999 996,99"
[10] "209 999 999,89" "209 999 999,92" "209 999 999,93" "209 999 999,95" "209 999 999,98" "209 999 999,99" "216 999 999,97" "217 998 985,77" "217 998 987,55"
[19] "218 000 000,00" "218 500 000,00" "218 999 994,00" "218 999 997,00" "218 999 997,99" "223 999 900,00" "223 999 900,00" "223 999 944,65" "223 999 944,72"
[28] "223 999 944,95" "223 999 998,99" "224 199 998,59" "224 199 998,69" "224 199 998,77" "224 199 998,80" "224 199 999,93" "224 799 998,77" "224 999 976,99"
[37] "225 100 004,00" "226 998 996,78" "226 998 997,88" "226 998 997,99" "226 999 998,98" "227 000 000,00" "227 000 000,00" "227 000 000,00" "227 000 000,00"
[46] "227 000 000,00" "227 399 966,91" "227 700 099,99" "227 798 980,71" "228 199 990,00" "229 199 987,98" "230 000 000,00" "230 278 898,81" "234 388 500,00"
[55] "234 388 582,00" "235 000 000,00" "235 999 999,00" "236 388 591,91" "236 388 592,78" "236 388 598,93" "236 388 598,94" "236 388 598,95" "236 388 598,96"
[64] "236 388 598,97" "236 388 600,00" "236 388 655,00" "236 989 873,90" "238 388 582,81" "244 000 000,00" "246 992 877,00" "247 992 884,00" "247 997 972,00"
[73] "247 997 978,98" "249 999 999,99" "250 000 000,00" "250 000 000,00" "254 999 999,99" "261 000 000,00" "264 999 999,99" "280 000 000,00" "285 000 000,00"

The problem is that I can no longer do any numeric operations on that vector, because it is now of type character:

> class(eve)
[1] "character"
> typeof(eve)
[1] "character"

Is there a way in R to keep numbers displayed in a neat format and still be able to do numeric operations on them?

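I know I could run all operations on the original vector and only show the formatted values through a formatting function when needed, but that seems like a waste of time to me. When looking at numbers, especially large ones, you often can't see the actual value they represent unless you count the exact number of digits. You also can't tell whether one value is 10 times larger or smaller than another, because without proper formatting it all blurs together.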

1 answer:

Answer 0: (score: 3)

You need to understand the difference between an object's internal representation and the way it is printed.
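To make that distinction concrete, here is a minimal sketch. The two values in `x` are made up for illustration and are not taken from the question. It shows that formatting produces a separate character copy while the numeric vector stays usable, and that the "hidden" decimals are just a consequence of the default number of significant digits used when printing.

```r
# Two made-up amounts with decimals (illustrative values, not from the question)
x <- c(208999990.25, 209499999.5)

# formatC() returns a new character vector; x itself is not modified
shown <- formatC(x, decimal.mark = ",", big.mark = " ", digits = 2, format = "f")

is.numeric(x)        # the stored values are still doubles
is.character(shown)  # the formatted copy is text
sum(x)               # arithmetic uses the stored values, not what is printed

# Default printing rounds to getOption("digits") significant digits (7 by default);
# asking for more reveals the decimals without converting anything
print(x, digits = 12)
```

As long as the result of `formatC()` is not assigned back over the original vector, nothing is lost.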

I don't agree that defining a class is overkill here:

num <- c(208999990, 308999990, 408999995)
class(num) <- c("niceprint", class(num))

print.niceprint <- function(x, decimal.mark=",", big.mark=" ", digits = 2, ...) {
  print(formatC(unclass(x), decimal.mark=decimal.mark, big.mark=big.mark, digits = digits, format = "f"))
}

#for printing in data.frames
format.niceprint <- function(x, decimal.mark=",", big.mark=" ", digits = 2, ...) {
  formatC(unclass(x), decimal.mark=decimal.mark, big.mark=big.mark, digits = digits, format = "f")
}

num
#[1] "208 999 990,00" "308 999 990,00" "408 999 995,00"

data.frame(x=num, y=2*num)
#                 x                y
#1 208 999 990,0000 417 999 980,0000
#2 308 999 990,0000 617 999 980,0000
#3 408 999 995,0000 817 999 990,0000

#a matrix
t(num)
#     [,1]             [,2]             [,3]            
#[1,] "208 999 990,00" "308 999 990,00" "408 999 995,00"
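
As a follow-up sketch (an addition, not part of the original answer), the lines below check that a vector with the `niceprint` class still behaves like a number. The comments describe how I would expect base R to handle the class attribute; the `invisible(x)` return is a small convention added here, not something the answer above requires.

```r
num <- c(208999990, 308999990, 408999995)
class(num) <- c("niceprint", class(num))

print.niceprint <- function(x, decimal.mark = ",", big.mark = " ", digits = 2, ...) {
  print(formatC(unclass(x), decimal.mark = decimal.mark, big.mark = big.mark,
                digits = digits, format = "f"))
  invisible(x)  # print methods conventionally return their argument invisibly
}

is.numeric(num)                 # still a double vector underneath
doubled <- 2 * num              # arithmetic keeps attributes, so the class survives
inherits(doubled, "niceprint")  # the result therefore still prints nicely
total <- sum(num)               # sum() returns a plain number without the class
plain <- unclass(num)           # drop the class to get R's default printing back
```

If a plain result should be shown in the same style, it can be given the class again, for example with `class(total) <- c("niceprint", class(total))`.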