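I'm writing code to produce pdf (from postscript, of course), and I've done my best to follow the spec. But imagemagick's identify says there is a problem with my xref table.
Can anyone see where/what my problem is?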
$ echo quit | gsnd -q pw.ps dancingmen.ps | identify -
**** Warning: An error occurred while reading an XREF table.
**** The file has been damaged. This may have been caused
**** by a problem while converting or transfering the file.
**** Ghostscript will attempt to recover the data.
**** This file had errors that were repaired or ignored.
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
-=>/tmp/magick-16940kBciKvHuOrD3 PBM 612x792 612x792+0+0 16-bit Bilevel Gray 61KB 0.000u 0:00.000
My pdf (produced with ghostscript on Linux, single LF eols):
%PDF-1.3
1 0 obj
<< /Type /Catalog
/Pages 2 0 R
>>
endobj
2 0 obj
<< /Kids [ 3 0 R ]
/Type /Pages
/Count 1
>>
endobj
3 0 obj
<< /Contents [ 4 0 R ]
/MediaBox [ 0.0 0.0 612.0 792.0 ]
/Type /Page
/Parent 2 0 R
>>
endobj
4 0 obj
<< /Length 1287
>>
stream
2.0 4.0 m 2.0 3.9 l 2.05516 3.9 2.1 3.94484 2.1 4.0 c 2.1 4.05516 2.05516 4.1 2.0 4.1 c 1.94484 4.1 1.9 4.05516 1.9 4.0 c 1.9 3.94484 1.94484 3.9 2.0 3.9 c f 2.0 3.6 m 2.5 3.1 l S -2.0 3.6 m -1.5 3.1 l S 2.0 3.1 m 2.4 2.8 l 2.1 2.4 l 2.2 2.35 l S -2.0 3.1 m -1.7 2.6 l -1.5 2.8 l S 2.0 3.9 m 2.0 3.6 l 2.0 3.1 l S 3.0 4.0 m 3.0 3.9 l 3.05516 3.9 3.1 3.94484 3.1 4.0 c 3.1 4.05516 3.05516 4.1 3.0 4.1 c 2.94484 4.1 2.9 4.05516 2.9 4.0 c 2.9 3.94484 2.94484 3.9 3.0 3.9 c f 3.0 3.6 m 3.5 3.1 l S -3.0 3.6 m -2.5 4.1 l S 3.0 3.1 m 3.0 2.3 l 3.15 2.3 l S -3.0 3.1 m -3.0 2.3 l -2.85 2.3 l S 3.0 3.9 m 3.0 3.6 l 3.0 3.1 l S 4.0 4.0 m 4.0 3.9 l 4.05516 3.9 4.1 3.94484 4.1 4.0 c 4.1 4.05516 4.05516 4.1 4.0 4.1 c 3.94484 4.1 3.9 4.05516 3.9 4.0 c 3.9 3.94484 3.94484 3.9 4.0 3.9 c f 4.0 3.6 m 4.5 4.1 l S -4.0 3.6 m -3.5 4.1 l S 4.0 3.1 m 4.3 2.6 l 4.5 2.8 l S -4.0 3.1 m -3.7 2.6 l -3.5 2.8 l S 4.0 3.9 m 4.0 3.6 l 4.0 3.1 l S 5.0 4.0 m 5.0 3.9 l 5.05516 3.9 5.1 3.94484 5.1 4.0 c 5.1 4.05516 5.05516 4.1 5.0 4.1 c 4.94484 4.1 4.9 4.05516 4.9 4.0 c 4.9 3.94484 4.94484 3.9 5.0 3.9 c f 5.0 3.6 m 5.5 4.1 l 5.5 4.3 l 5.6 4.3 l 5.6 4.2 l 5.5 4.2 l S -5.0 3.6 m -4.5 3.1 l S 5.0 3.1 m 5.4 2.8 l 5.1 2.4 l 5.2 2.35 l S -5.0 3.1 m -4.6 2.8 l -4.9 2.4 l -4.8 2.35 l S 5.0 3.9 m 5.0 3.6 l 5.0 3.1 l S
endstream
endobj
xref
0 4
0000000000 65535 f
0000000010 00000 n
0000000063 00000 n
0000000127 00000 n
0000000234 00000 n
trailer
<<
/Root 1 0 R
/Size 4
>>
startxref
1581
%%EOF
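For reference, this is the postscript drawing being converted.
Update: I've fixed several of the problems mentioned above: the missing xref keyword, and %%EOF instead of $$EOF. The error from identify is the same, but Chrome's viewer actually shows me a picture (very small, in the lower-left corner, since I'm not handling the graphics state yet).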
link to newer file with single content stream
Output from ghostscript:
$ echo pstack quit | gsnd -q data/pw.ps data/dancingmen.ps | gsnd -sDEVICE=ps2write -dPDFDEBUG -dPDFSTOPONERROR -
GPL Ghostscript 9.18 (2015-10-05)
Copyright (C) 2015 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
**** Warning: An error occurred while reading an XREF table.
**** The file has been damaged. This may have been caused
**** by a problem while converting or transfering the file.
**** Ghostscript will attempt to recover the data.
<<
/Root 1 0 R
/Size 4 >>
%Resolving: [1 0]
<<
/Type /Catalog /Pages 2 0 R
>>
endobj
%Resolving: [2 0]
<<
/Kids [
3 0 R
]
/Type /Pages /Count 1 >>
endobj
%Resolving: [3 0]
<<
/Contents [
4 0 R
]
/MediaBox [
0.0 0.0 612.0 792.0 ]
/Type /Page /Parent 2 0 R
>>
endobj
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [1 0]
%Resolving: [1 0]
%Resolving: [1 0]
%Resolving: [1 0]
%Resolving: [2 0]
Processing pages 1 through 1.
Page 1
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [3 0]
%Resolving: [3 0]
%Resolving: [3 0]
%Resolving: [3 0]
%Resolving: [3 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [1 0]
%Resolving: [2 0]
%Resolving: [4 0]
<<
/Length 1288 >>
stream
%FilePosition: 270
endobj
2.0 4.0 m
2.0 3.9 l
2.05516 3.9 2.1 3.94484 2.1 4.0 c
2.1 4.05516 2.05516 4.1 2.0 4.1 c
1.94484 4.1 1.9 4.05516 1.9 4.0 c
1.9 3.94484 1.94484 3.9 2.0 3.9 c
f
2.0 3.6 m
2.5 3.1 l
S
-2.0 3.6 m
-1.5 3.1 l
S
2.0 3.1 m
2.4 2.8 l
2.1 2.4 l
2.2 2.35 l
S
-2.0 3.1 m
-1.7 2.6 l
-1.5 2.8 l
S
2.0 3.9 m
2.0 3.6 l
2.0 3.1 l
S
3.0 4.0 m
3.0 3.9 l
3.05516 3.9 3.1 3.94484 3.1 4.0 c
3.1 4.05516 3.05516 4.1 3.0 4.1 c
2.94484 4.1 2.9 4.05516 2.9 4.0 c
2.9 3.94484 2.94484 3.9 3.0 3.9 c
f
3.0 3.6 m
3.5 3.1 l
S
-3.0 3.6 m
-2.5 4.1 l
S
3.0 3.1 m
3.0 2.3 l
3.15 2.3 l
S
-3.0 3.1 m
-3.0 2.3 l
-2.85 2.3 l
S
3.0 3.9 m
3.0 3.6 l
3.0 3.1 l
S
4.0 4.0 m
4.0 3.9 l
4.05516 3.9 4.1 3.94484 4.1 4.0 c
4.1 4.05516 4.05516 4.1 4.0 4.1 c
3.94484 4.1 3.9 4.05516 3.9 4.0 c
3.9 3.94484 3.94484 3.9 4.0 3.9 c
f
4.0 3.6 m
4.5 4.1 l
S
-4.0 3.6 m
-3.5 4.1 l
S
4.0 3.1 m
4.3 2.6 l
4.5 2.8 l
S
-4.0 3.1 m
-3.7 2.6 l
-3.5 2.8 l
S
4.0 3.9 m
4.0 3.6 l
4.0 3.1 l
S
5.0 4.0 m
5.0 3.9 l
5.05516 3.9 5.1 3.94484 5.1 4.0 c
5.1 4.05516 5.05516 4.1 5.0 4.1 c
4.94484 4.1 4.9 4.05516 4.9 4.0 c
4.9 3.94484 4.94484 3.9 5.0 3.9 c
f
5.0 3.6 m
5.5 4.1 l
5.5 4.3 l
5.6 4.3 l
5.6 4.2 l
5.5 4.2 l
S
-5.0 3.6 m
-4.5 3.1 l
S
5.0 3.1 m
5.4 2.8 l
5.1 2.4 l
5.2 2.35 l
S
-5.0 3.1 m
-4.6 2.8 l
-4.9 2.4 l
-4.8 2.35 l
S
5.0 3.9 m
5.0 3.6 l
5.0 3.1 l
S
**** This file had errors that were repaired or ignored.
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
%Resolving: [2 0]
%Resolving: [1 0]
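Update: Sigh. I suppose it's best if I show the code. The program is meant to hook into postscript, capture certain drawing operators on paths, and produce a pdf file of the content. For now I'm ignoring the quality of the output, in particular the transformation matrix.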
/prompt {} def
<<
/.create-pdf-data { % called at start
install-operator-overrides
}
/.create-pdf-page { % called at showpage
1 /PageNumber +=
<< /Type /Page
/Parent pdf-object-names /Pages get create-ref
/MediaBox [gsave newpath clippath pathbbox grestore]
/Contents []
>>
current-page-name dup 3 1 roll create-object
pdf-object-names exch get create-ref add-to-pages-kids
[ display-list {
exch pop
create-content-stream
} for-each ]
{ ( ) exch strcat strcat } reduce
add-content-to-page
}
/current-page-name {
(Page) PageNumber as-string strcat
}
/current-page {
pdf-objects pdf-object-names current-page-name get get
}
/.output-pdf { % called at quit
/OutputFileName where { pop OutputFileName }{ (%stdout) } ifelse
(w) file write-pdf
pstack
}
/operator-overrides <<
%/start .create-pdf-data
/stroke ({ mark-path /S cvx ] display //super//call })
/fill ({ mark-path /f cvx ] display //super//call })
/showpage ({ .create-pdf-page //super//call })
/quit ({ .output-pdf //super//call })
>>
/install-operator-overrides {
operator-overrides {
1 index load
dup /super exch def
type /arraytype eq { /exec load }{ /dummyproc cvx } ifelse
/call exch def
cvx exec userdict 3 1 roll put
} forall
userdict /dummyproc {} put
}
/PageNumber 0
/+= { dup load 3 2 roll add store }
/write-pdf {
/f exch def
(1.3) write-header
write-body
write-xref-table
write-trailer
}
/pdf-output-file-position 0
/write-header {
/pdf-output-file-position 0 store
(%PDF-) .w .w \n \n
}
/write-body {
write-objects-and-save-positions
}
/write-objects-and-save-positions {
pdf-objects {
1 index save-position
write-object
} for-each
}
/write-xref-table {
(xref) .w \n
pdf-output-file-position /xref-position exch def
(0 ) .w pdf-object-positions length 1 sub .n \n
0 format-10 .w ( 65535 f ) .w \n
pdf-object-positions {
write-xref-table-row
} for-each
}
/write-xref-table-row {
exch pop format-10 .w
( 00000 n ) .w \n
}
/format-10-string 20 string
/format-10 {
format-10-string cvs
(0000000000) 0 10 3 index length sub getinterval
exch strcat
}
/write-trailer {
(trailer) .w \n
(<<) .w \n
( /Root 1 0 R) .w \n
( /Size ) .w pdf-objects length 1 sub .n \n
(>>) .w \n
(startxref) .w \n
xref-position .n \n
(%%EOF) .w \n
}
/create-content-stream {
to-string-with-spaces
%dup length ==only ( ) print ==
}
/write-object {
exch .n ( 0 obj) .w \n
dup write-dict
pdf-streams exch 2 copy known { write-stream }{ pop pop } ifelse
(endobj) .w \n \n
}
/write-stream {
(stream) .w \n
get .w \n
(endstream) .w \n
}
/write-dict {
(<< ) .w
{ exch write-thing write-thing \n } forall
(>> ) .w \n
}
/write-thing {
+is-ref { write-ref }{
+is-name { write-name }{
+is-array { write-array }{
+is-null { pop (null ) .w }{
.n ( ) .w
} ifelse } ifelse } ifelse } ifelse
}
/write-ref {
ref .n ( 0 R ) .w
}
/write-name {
dup xcheck not { (/) .w } if
.n ( ) .w
}
/write-array {
([ ) .w
{ write-thing } forall
(] ) .w
}
/+is-ref { dup is-ref }
/+is-name { dup is-name }
/+is-array { dup is-array }
/+is-null { dup is-null }
/is-string { type /stringtype eq }
/is-array { type /arraytype eq }
/is-name { type /nametype eq }
/is-null { type /nulltype eq }
/is-ref { +is-name { is-ref-format }{ pop false } ifelse }
/is-ref-format { ref-check-string cvs 0 1 getinterval (&) eq }
/ref-check-string 20 string
/ref { 10 string cvs rest cvi }
/create-ref { (&) exch 10 string cvs strcat cvn }
/mark-path { [ { /m } { /l } { /c } { /h } pathforall }
/display { add-to-display-list }
/display-list <<
0 null
>>
/add-to-display-list { display-list dup 3 1 roll length exch put }
/clear-display-list { /display-list << 0 null >> store }
/pdf-objects << % integer keys
0 null
1 << /Type /Catalog /Pages /&2 >>
2 << /Type /Pages /Kids [] /Count 0 >>
>>
/pdf-object-names << % integer values
/Catalog 1
/Pages 2
>>
/pdf-object-positions << % integer keys
0 null
>>
/pdf-streams <<
>>
/create-object { % dict name
exch pdf-objects dup length 3 2 roll put
pdf-object-names exch pdf-objects length 1 sub put
}
/object { % name -> dict
pdf-object-names exch get pdf-objects exch get
}
/save-position {
pdf-object-positions exch pdf-output-file-position put
}
/Pages {
pdf-objects pdf-object-names /Pages get get
}
/add-content-to-page {
<<
/Length 2 index length 1 add
>> dup 3 2 roll pdf-streams 3 1 roll put
/current-content create-object
pdf-object-names /current-content get create-ref
current-page /Contents 2 copy get [ exch {}forall counttomark 4 add -1 roll ] put
}
/add-to-pages-kids { % ref
Pages /Kids 2 copy get [ exch {}forall counttomark 4 add -1 roll ] put
Pages /Count 2 copy get 1 add put
}
/.w { f exch dup length /pdf-output-file-position += writestring }
/.n { dup is-string not { .n-string cvs } if .w }
/.n-string 100 string
/\n { (\n) .w }
/to-string-with-spaces { {as-string} map {( ) exch strcat strcat} reduce }
/map { 1 index xcheck 3 1 roll [ 3 1 roll forall ] exch { cvx } if }
/reduce { exch dup first exch rest 3 -1 roll forall }
/first { 0 get }
/rest { 1 1 index length 1 sub getinterval }
/as-string { 20 string cvs dup length 13 gt { 0 7 getinterval } if }
/strcat { 2 copy length exch length add string dup 4 2 roll
3 copy pop 0 exch putinterval exch length exch putinterval }
/for-each { % dict proc key(int) value *proc*
1 1 3 index length 1 sub % d p 1 1 lim
[ 6 5 roll % p 1 1 lim [ d
1 /index cvx /get cvx % p 1 1 lim [ d 1 index get
9 8 roll /exec cvx ] cvx % 1 1 lim { d 1 index get p exec }
for
}
>>
{ dup {
dup type /arraytype ne {
def
}{ % Dict name proc
[ 3 index /begin cvx
3 -1 roll {} forall
/end cvx
] cvx
def
} ifelse
} forall pop
} pop
begin
.create-pdf-data
Answer 0 (score: 2)
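It would help to put the file somewhere rather than pasting it. PDF files are binary, and length calculations depend on CR/LF pairs, which means that every /Length may be incorrect, and there is no way to tell by looking at a pasted file.
Similarly, the xref table offsets may be incorrect. In fact the offset for entry 1 looks incorrect, even assuming LF EOLs, but it's impossible to be certain from a pasted file.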
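Since the offsets can only be judged against the real bytes, one way to check them is to read the file in binary mode and compare each in-use xref entry with what actually sits at that offset. The following is a minimal sketch in Python, not part of the PostScript program; it assumes a single classic xref section with one subsection, and the name check_xref_offsets is made up for illustration.

```python
import re

# Subsection header: the 'xref' keyword, then "first count" on the next line.
XREF_RE = re.compile(rb"^xref\s+(\d+)\s+(\d+)\s*\n", re.M)
# One table entry: 10-digit offset, 5-digit generation, 'n' (in use) or 'f' (free).
ENTRY_RE = re.compile(rb"(\d{10}) (\d{5}) ([nf])")

def check_xref_offsets(pdf):
    """Return a list of human-readable problems found in the xref table."""
    problems = []
    head = XREF_RE.search(pdf)
    if head is None:
        return ["no xref table found"]
    first, count = int(head.group(1)), int(head.group(2))
    table_end = pdf.find(b"trailer", head.end())
    if table_end == -1:
        table_end = len(pdf)
    entries = ENTRY_RE.findall(pdf[head.end():table_end])
    if len(entries) != count:
        problems.append(
            "subsection declares %d entries, found %d" % (count, len(entries)))
    for num, (offset, gen, kind) in enumerate(entries, start=first):
        if kind != b"n":
            continue  # free entries do not point at an object
        expected = b"%d %d obj" % (num, int(gen))
        if not pdf.startswith(expected, int(offset)):
            problems.append(
                "object %d: no '%s' at offset %d"
                % (num, expected.decode("ascii"), int(offset)))
    return problems
```

Calling check_xref_offsets(open(path, "rb").read()) on the generated file (path being wherever you saved it) should return an empty list when every entry points at its object. The binary mode matters, because any EOL translation changes the offsets.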
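Note that the error messages are coming from Ghostscript (which IM uses to process PDF files). You might get more information if you just fed the PDF file to Ghostscript directly. You could also try setting -dPDFDEBUG and -dPDFSTOPONERROR; the combination will print out the objects GS is processing and what it thinks the problem is (if there is a PostScript error). Other PDF problems usually emit some kind of back-channel output.
Note that the Ghostscript message cites the xref table as the problem:
**** Warning: An error occurred while reading an XREF table.
So I suspect your xref table is incorrect (see also object 0 below).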
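Not breaking, but not best practice:
xref entry 0, the head of the linked list of free objects, has an offset of 0000000028; it should be 0.
Your file appears to end with $$EOF instead of %%EOF.
It's common practice to put some binary characters in a comment on line 2, so that applications are forced to treat the file as binary when transferring it.
It's better to omit the Resources dictionary than to use a null object; it's smaller.
Similarly, it's better to use a single content stream (although recent Adobe engines produce multiple streams), because it's smaller.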
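Two of the points above, the binary comment on line 2 and the free-list head at entry 0, can be sketched as follows. This is an illustrative Python sketch, not the poster's PostScript; the helper names pdf_header and xref_entry are made up, and the four high-bit bytes are just one common choice.

```python
def pdf_header(version="1.3"):
    """Header line plus a comment holding four bytes >= 128 on line 2."""
    return ("%%PDF-%s\n" % version).encode("ascii") + b"%\xe2\xe3\xcf\xd3\n"

def xref_entry(offset, generation=0, in_use=True):
    """One xref entry: always exactly 20 bytes, including the EOL.

    With a single-character EOL (LF here), a space goes before it so the
    entry still comes to 20 bytes.
    """
    kind = b"n" if in_use else b"f"
    return b"%010d %05d %s \n" % (offset, generation, kind)

# Entry 0 heads the free list: offset field 0, generation 65535, type 'f'.
FREE_LIST_HEAD = xref_entry(0, 65535, in_use=False)
```

Fixed-width entries are what let a reader seek straight to entry N without parsing the ones before it, which is why the padding and trailing space matter.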
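Obviously this is early work in progress, and I'm sure you'll deal with these in time.
If you post the actual PDF file somewhere, I can take a look.
[EDIT]
So the first problem is that the xref table subsection is incorrect. The subsection begins with 2 numbers, the initial index and the number of entries in the table. The xref table has 5 entries, starting at index 0 and going up to index 4. The subsection says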
0 4
Correcting this to 0 5 brings us to the next problem: the /Size entry in the trailer dictionary is 4 and should be 5.
But Ghostscript still complains.
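The two counts discussed above are the same number: the objects actually written plus one for entry 0. A minimal Python sketch of that relationship, using the four object offsets from the pasted file (10, 63, 127, 234); the function names are hypothetical and it assumes objects are numbered 1..n with generation 0:

```python
def xref_section(offsets):
    """Build the xref section; offsets[i] is the byte offset of object i+1."""
    count = len(offsets) + 1          # objects 1..n plus the free entry 0
    lines = [b"xref\n", b"0 %d\n" % count, b"0000000000 65535 f \n"]
    lines += [b"%010d 00000 n \n" % off for off in offsets]
    return b"".join(lines)

def trailer_dict(offsets, root_obj=1):
    """/Size is one more than the highest object number, i.e. the same count."""
    size = len(offsets) + 1
    return b"trailer\n<< /Root %d 0 R /Size %d >>\n" % (root_obj, size)

section = xref_section([10, 63, 127, 234])
trailer = trailer_dict([10, 63, 127, 234])
```

Deriving both numbers from the same list makes it hard for the subsection header and /Size to drift apart.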
The final problem is that the startxref offset is incorrect. Currently it is:
startxref 1581
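But the actual byte offset of the 'xref' keyword is byte 1576.
If I correct all three of these problems then Ghostscript opens the file without complaint. It did already render the content (very small, since there are no CTM operations), but now it doesn't need to repair the file.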
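The startxref value can be checked mechanically by comparing it with where the xref keyword really starts. The sketch below is in Python and assumes a single classic xref table whose keyword sits at the start of a line; check_startxref is a made-up name.

```python
import re

def check_startxref(pdf):
    """Return (declared, actual) byte offsets of the xref keyword."""
    m = re.search(rb"startxref\s+(\d+)\s+%%EOF", pdf)
    if m is None:
        raise ValueError("no startxref / %%EOF found")
    declared = int(m.group(1))
    # '^xref' in multiline mode skips 'startxref', which starts with 's'.
    hits = list(re.finditer(rb"^xref\b", pdf, re.M))
    if not hits:
        raise ValueError("no xref keyword found")
    actual = hits[-1].start()         # the last table is the one startxref names
    return declared, actual
```

Here the difference, 1581 - 1576, is exactly 5: the length of "xref" plus one LF. That would be consistent with the output position being sampled after the keyword and its newline were written rather than before, which appears to be the order used in write-xref-table in the question.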