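Is it possible to use the Apache POI transformer in Alfresco?

Time: 2016-01-18 10:26:02

Tags: apache-poi alfresco

I am trying to use the Apache POI transformer in Alfresco to convert Excel files to HTML, but so far without success.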

In <Project-home>/src/main/amp/config/alfresco/extension/subsystems/Transformers/default/default/transformers.properties I added:

   content.transformer.Poi.priority=70
   content.transformer.Poi.extensions.xlsx.html.supported=true

Then I set

   log4j.logger.org.alfresco.repo.content.transform.TransformerDebug=TRACE
   log4j.logger.org.alfresco.util.exec.RuntimeExec=TRACE

but the logs show no transformer being invoked for the Excel transformation.

Edit: The Mimetypes webscript (GET /alfresco/s/mimetypes?mimetype={mimetype?}) returns

application/vnd.openxmlformats-officedocument.spreadsheetml.sheet - xlsx
Extractors: org.alfresco.repo.content.metadata.PoiMetadataExtracter
Transformable To:

    application/eps = Complex via: application/pdf
    application/pdf = Using a Direct Open Office Connection
    application/vnd.ms-excel = Using a Direct Open Office Connection
    application/vnd.oasis.opendocument.spreadsheet = Using a Direct Open Office Connection
    application/vnd.oasis.opendocument.spreadsheet-template = Using a Direct Open Office Connection
    application/vnd.sun.xml.calc = Using a Direct Open Office Connection
    application/vnd.sun.xml.calc.template = Using a Direct Open Office Connection
    application/xhtml+xml = org.alfresco.repo.content.transform.TikaAutoContentTransformer
    image/bmp = Complex via: application/pdf
    image/cgm = Complex via: application/pdf
    image/gif = Complex via: application/pdf
    image/ief = Complex via: application/pdf
    image/jp2 = Complex via: application/pdf
    image/jpeg = org.alfresco.repo.content.transform.OOXMLThumbnailContentTransformer
    image/png = Complex via: application/pdf
    image/tiff = Complex via: application/pdf
    image/vnd.adobe.photoshop = Complex via: application/pdf
    image/vnd.adobe.premiere = Complex via: application/pdf
    image/x-cmu-raster = Complex via: application/pdf
    image/x-dwt = Complex via: application/pdf
    image/x-portable-anymap = Complex via: application/pdf
    image/x-portable-bitmap = Complex via: application/pdf
    image/x-portable-graymap = Complex via: application/pdf
    image/x-portable-pixmap = Complex via: application/pdf
    image/x-raw-adobe = Complex via: image/jpeg
    image/x-raw-canon = Complex via: image/jpeg
    image/x-raw-fuji = Complex via: image/jpeg
    image/x-raw-hasselblad = Complex via: image/jpeg
    image/x-raw-kodak = Complex via: image/jpeg
    image/x-raw-leica = Complex via: image/jpeg
    image/x-raw-minolta = Complex via: image/jpeg
    image/x-raw-nikon = Complex via: image/jpeg
    image/x-raw-olympus = Complex via: image/jpeg
    image/x-raw-panasonic = Complex via: image/jpeg
    image/x-raw-pentax = Complex via: image/jpeg
    image/x-raw-red = Complex via: image/jpeg
    image/x-raw-sigma = Complex via: image/jpeg
    image/x-raw-sony = Complex via: image/jpeg
    image/x-xbitmap = Complex via: application/pdf
    image/x-xpixmap = Complex via: application/pdf
    image/x-xwindowdump = Complex via: application/pdf
    text/html = org.alfresco.repo.content.transform.PoiHssfContentTransformer
    text/plain = org.alfresco.repo.content.transform.TikaAutoContentTransformer
    text/xml = org.alfresco.repo.content.transform.TikaAutoContentTransformer

This shows the transformers.

1 Answer:

Answer 0 (score: 2)

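I solved this by creating a complex transformation pipeline along the path XLSX => PDF => HTML. For the PDF => HTML step I used coolwanglu's pdf2htmlEX. It is a bit tricky to install, so use this script to install it on Ubuntu, and do not install it on CentOS < 7 because of problems with Python. As for the extension:
src/main/amp/config/alfresco/extension/subsystems/Transformers/default/default/transformers.properties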

#increase the default maximum allowed source size
content.transformer.OpenOffice.extensions.xlsx.pdf.maxSourceSizeKBytes=5120

#disable ootb pdf->html and xlsx->html transformation path (Apparently has no effect)
content.transformer.OpenOffice.extensions.xlsx.html.supported=false
content.transformer.complex.OpenOffice.PdfBox.extensions.*.html.available=false
content.transformer.complex.OpenOffice.PdfBox.extensions.*.html.supported=false

#PDF to html transformer
content.transformer.pdf2htmlex.available=true
#content.transformer.pdf2htmlex.thresholdCount=5
#content.transformer.default.timeoutMs=180000
content.transformer.pdf2htmlex.priority=50
content.transformer.pdf2htmlex.extensions.pdf.html.supported=true
content.transformer.pdf2htmlex.extensions.pdf.html.priority=50
content.transformer.pdf2htmlex.extensions.pdf.html.maxSourceSizeKBytes=9999


#XLSX to HTML pipeline
content.transformer.complex.Xlsx.Html.pipeline=*|pdf|*
content.transformer.complex.Xlsx.Html.available=true
content.transformer.complex.Xlsx.Html.extensions.xlsx.html.priority=30
content.transformer.complex.Xlsx.Html.extensions.xlsx.html.supported=true
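
The pipeline value above can be read as alternating transformer names and intermediate extensions separated by |. The annotated copy below is a sketch of that reading, not additional configuration. It assumes the convention used in Alfresco's stock transformers.properties, where * stands for any available transformer. The step-by-step comments are an interpretation of this setup, not something confirmed by logs in this post.

```properties
# Layout:  <transformer> | <intermediate extension> | <transformer>
#               *        |          pdf             |      *
# Step 1: any available transformer converts xlsx to pdf
#         (OpenOffice in the mimetype listing from the question).
# Step 2: any available transformer converts pdf to html
#         (intended to be pdf2htmlex, registered above with priority 50).
content.transformer.complex.Xlsx.Html.pipeline=*|pdf|*
```

Naming the second step explicitly ("*|pdf|pdf2htmlex") produced a startup error in this setup, as noted in the comment on the transformer.pdf2htmlex bean, so the wildcard form is the one used here.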

Transformer beans:

<bean id="transformer.worker.pdf2htmlex"
      class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker">
    <property name="mimetypeService">
        <ref bean="mimetypeService"/>
    </property>
    <property name="checkCommand">
        <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandsAndArguments">
                <map>
                    <entry key=".*">
                        <list>
                            <value>pdf2htmlEX</value>
                            <value>-v</value>
                        </list>
                    </entry>
                </map>
            </property>
            <!--<property name="errorCodes">
                <value>1</value>
            </property>-->
        </bean>
    </property>
    <property name="transformCommand">
        <bean class="org.alfresco.util.exec.RuntimeExec">
            <property name="commandsAndArguments">
                <map>
                    <entry key=".*">
                        <list>
                            <value>pdf2htmlEX</value>
                            <value>--embed</value>
                            <value>CFIJO</value>
                            <value>${source}</value>
                            <value>${target}</value>
                        </list>
                    </entry>
                </map>
            </property>
            <property name="processDirectory" value="/"/>
        </bean>
    </property>
    <property name="explicitTransformations">
        <list>
            <bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails">
                <constructor-arg>
                    <value>application/pdf</value>
                </constructor-arg>
                <constructor-arg>
                    <value>text/html</value>
                </constructor-arg>
            </bean>
        </list>
    </property>
</bean>

<bean id="transformer.pdf2htmlex" class="org.alfresco.repo.content.transform.ProxyContentTransformer"
      init-method="register"
      parent="baseContentTransformer">
    <property name="worker" ref="transformer.worker.pdf2htmlex"/>
    <!-- The next two properties were added because of the line at
    https://github.com/Alfresco/community-edition/blob/afde3f58f91567b6f7eaa0bbac5e5adc38087fe0/projects/repository/source/java/org/alfresco/repo/content/transform/AbstractContentTransformer2.java#L135
    after getting the following error on startup:
    Cannot create dynamic transformer transformer.complex.Xlsx.Html as sub transformers could not be found or
    created ("*|pdf|pdf2htmlex").
    They had no effect, because the pipeline property needs to be in the form *|pdf|*. They are left here in case
    this changes in a future Alfresco release, so that custom transformers can still be registered with the
    contentTransformerRegistry on startup.
    -->
    <property name="registry" ref="contentTransformerRegistry"/>
    <property name="registerTransformer" value="true"/>
</bean>