Size explosion: file vs. string

Date: 2016-10-20 15:33:32

Tags: string common-lisp sbcl

I have a 261 MB text file (xdebug output). When I read it into a string, it takes up an additional 2 GB of dynamic space.

(defun stream->string (tmp-stream)
  ;; Collect every line in a list (newest first), then join them with newlines.
  (do ((line (read-line tmp-stream nil nil)
             (read-line tmp-stream nil nil))
       (lines nil))
      ((not line)
       (format t "COLLECTED~%")
       (format nil "~{~a~^~%~}" (reverse lines)))
    (push line lines)))


(defparameter *test* nil)

(progn
  (setf *test* nil)
  (sb-ext:gc :full t)
  (room)
  (format t "----~%")
  (with-open-file (stream "/home/.../debugFiles/xdebug_1.xt")
    (room)
    (format t "----~%")
    (setf *test* (stream->string stream))
    (sb-ext:gc :full t)
    (room)
    (format t "----~%"))
  (sb-ext:gc :full t)
  (room))

Output

Dynamic space usage is:   84,598,224 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,408 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  20,841,808 bytes for    20,691 code objects.
  15,989,600 bytes for   999,350 cons objects.
  14,532,960 bytes for   118,880 simple-vector objects.
  13,951,792 bytes for   168,301 instance objects.
   5,994,864 bytes for    41,648 simple-character-string objects.
  13,287,200 bytes for   215,901 other objects.
  84,598,224 bytes for 1,564,771 dynamic objects (space total.)
----
Dynamic space usage is:   85,346,752 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,536 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  20,842,928 bytes for    20,692 code objects.
  16,125,008 bytes for 1,007,813 cons objects.
  14,698,784 bytes for   120,834 simple-vector objects.
  14,239,440 bytes for   171,411 instance objects.
   6,014,144 bytes for    41,776 simple-character-string objects.
  13,426,448 bytes for   219,723 other objects.
  85,346,752 bytes for 1,582,249 dynamic objects (space total.)
----
COLLECTED
Dynamic space usage is:   2,557,851,296 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,536 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  2,466,544,480 bytes for   817,255 simple-character-string objects.
  91,306,816 bytes for 2,303,370 other objects.
  2,557,851,296 bytes for 3,120,625 dynamic objects (space total.)
----
Dynamic space usage is:   1,131,069,056 bytes.
Read-only space usage is:      5,856 bytes.
Static space usage is:         4,160 bytes.
Control stack usage is:        8,360 bytes.
Binding stack usage is:        1,072 bytes.
Control and binding stack usage is for the current thread only.
Garbage collection is currently enabled.

Breakdown for dynamic space:
  1,053,183,424 bytes for    41,547 simple-character-string objects.
  77,885,632 bytes for 1,510,521 other objects.
  1,131,069,056 bytes for 1,552,068 dynamic objects (space total.)

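I could understand a threefold size (even though that would still surprise me):

  1. the collection of lines
  2. the string object created by format
  3. the string stored in *test*

But a tenfold increase is substantial.

How is this possible?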
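
As a rough sanity check on the (room) figures above, the growth in simple-character-string bytes can be converted into a character count. This is only a sketch: it assumes a Unicode-enabled SBCL build in which each character of a character string occupies 4 bytes, and it ignores per-object headers. The names below (estimated-char-count and the three parameters) are made up for illustration and are not part of SBCL.

```lisp
;; Figures copied from the (room) output above.
(defparameter *baseline-string-bytes* 6014144)   ; strings before the file is read
(defparameter *final-string-bytes* 1053183424)   ; strings after the last full GC

;; ASSUMPTION: 4 bytes per character (Unicode-enabled SBCL build).
(defparameter *assumed-bytes-per-char* 4)

(defun estimated-char-count (final-bytes baseline-bytes bytes-per-char)
  "Rough number of characters in the strings allocated between two measurements."
  (floor (- final-bytes baseline-bytes) bytes-per-char))

;; Prints: 261,792,320 characters
(format t "~:d characters~%"
        (estimated-char-count *final-string-bytes*
                              *baseline-string-bytes*
                              *assumed-bytes-per-char*))
```

Under that assumption the string left in *test* holds roughly 262 million characters, which is close to the 261 MB file size, so about a fourfold cost for the final string alone would be expected. The sketch does not account for the higher figure measured inside with-open-file.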

0 answers:

No answers