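We need to process a book of more than 1500 pages, written in QuarkXPress 2018, with a Python pipeline so that its data can be used in a triple store. Unfortunately, QuarkXPress offers no export format that suits our purposes. The best option is the HTML export.
This works well for "normal" text in the document. Everything else, such as tables, pictures, or special text blocks, always ends up at the end of the HTML document. The view in the browser still matches the original book, because all elements are absolutely positioned.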
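The picture above shows the browser view on the left and the HTML document on the right. Matching colors mark the same lines. As you can see, the green, orange, yellow, blue, and purple blocks sit inside the red block (red block = normal text) in the browser, which reproduces the original view of the book.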
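To solve the problem, we tried importing the HTML document into the Python pipeline, then importing the same page as a PDF document, and using simple string matching to order the HTML tags by their position in the PDF document. It works reasonably well, but of course it does not work for pictures, and the error rate is fairly high. Because the meta information of the tags (title, subtitle, table, etc.) is also important to us and is available only in the HTML document, we cannot rely on the PDF document alone.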
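The string-matching attempt described above can be sketched roughly as follows. This is a minimal sketch, not our actual pipeline: it assumes the PDF page text has already been extracted into a list of lines (the extraction step is not shown), and the names `best_match`, `order_by_pdf`, and the `min_ratio` threshold are hypothetical. It uses `difflib.SequenceMatcher` from the standard library for fuzzy comparison.

```python
from difflib import SequenceMatcher


def best_match(snippet, pdf_lines):
    """Return (line_index, similarity) of the PDF line closest to snippet."""
    best_index, best_ratio = -1, 0.0
    for index, line in enumerate(pdf_lines):
        ratio = SequenceMatcher(None, snippet.strip(), line.strip()).ratio()
        if ratio > best_ratio:
            best_index, best_ratio = index, ratio
    return best_index, best_ratio


def order_by_pdf(html_snippets, pdf_lines, min_ratio=0.6):
    """Sort (tag_id, text) pairs by the position of their best PDF match.

    Snippets whose best match is weaker than min_ratio (an assumed
    threshold) are returned separately, because their position in the
    PDF cannot be trusted. Pictures always end up in that second list.
    """
    placed, unplaced = [], []
    for tag_id, text in html_snippets:
        index, ratio = best_match(text, pdf_lines)
        if ratio >= min_ratio:
            placed.append((index, tag_id))
        else:
            unplaced.append(tag_id)
    placed.sort(key=lambda pair: pair[0])  # stable: ties keep HTML order
    return [tag_id for _, tag_id in placed], unplaced
```

The second return value makes the weakness of this approach visible: anything without matchable text, such as an image, has no position at all.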
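Since the browser displays the data in the correct order, there must be a way to compute that order. Is there a tool that sorts HTML tags in the order in which the browser displays them?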
Below you will also find a simpler HTML and CSS file that shows the problem. The CSS comes first, followed by the HTML page.
.para-NoStyle-127 {
font-family: 'Arial', 'ArialMT', 'Helvetica', 'sans-serif';
font-size: 140px;
font-weight: normal;
text-decoration: none;
-webkit-font-kerning: Normal;
font-kerning: Normal;
-webkit-font-variant-ligatures: no-common-ligatures;
font-variant-ligatures: no-common-ligatures;
-webkit-font-feature-settings: "rlig" 0, "liga" 0, "clig" 0, "calt" 0, "locl" 0, "ccmp" 0, "mark" 0, "mkmk" 0;
font-feature-settings: "rlig" 0, "liga" 0, "clig" 0, "calt" 0, "locl" 0, "ccmp" 0, "mark" 0, "mkmk" 0;
color: #222021;
position: absolute;
left: 0px;
top: 0px;
}
.QxpTextBox {
position: absolute;
left: 0px;
top: 0px;
white-space: nowrap;
width: 100%;
height: 100%;
line-height: 1;
transform-origin: 0% 0%;
-webkit-transform-origin: 0% 0%;
transform: scale(0.05, 0.05);
-webkit-transform: scale(0.05, 0.05);
}
.QxpVertTextBox {
position: absolute;
white-space: nowrap;
width: 100%;
height: 100%;
top: 0px;
right: 0px;
line-height: 1;
transform-origin: 100% 0%;
-webkit-transform-origin: 100% 0%;
transform: scale(0.05, 0.05);
-webkit-transform: scale(0.05, 0.05);
}
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>
237-276_DIV_CH
</title>
<meta content="width=369,user-scalable=no" name="viewport" />
<meta content="QuarkXPress 14.0.1" name="generator" />
<meta content="text/html;CHARSET=utf-8" http-equiv="Content-Type" />
<link href="ProjectCSS.css" rel="stylesheet" type="text/css" />
<link href="../assets/common.css" rel="stylesheet" type="text/css" />
<style type="text/css">
.panandzoom {
-webkit-transform-origin: 0px 0px;
-webkit-transition-property: -webkit-transform;
-webkit-transition-timing-function: ease;
}
a:link {
color: blue;
text-decoration: underline;
}
a:active {
color: red
}
a:visited {
color: purple
}
sub,
sup {
line-height: 0;
}
/* body style */
body {
background-color: white;
color: #222021;
}
#box1 {
position: absolute;
left: 339.661px;
top: 575.433px;
-webkit-transform: scale(1);
width: 28px;
height: 15px;
}
#box1_Props {
padding: 2px;
width: 24.346px;
height: 10.74px;
}
#box2 {
position: absolute;
left: 27.85px;
top: 43.243px;
-webkit-transform: scale(1);
width: 312px;
height: 532px;
}
#anchbox3 {
position: absolute;
-webkit-transform: scale(1);
width: 285px;
height: 45px;
display: inline-block;
}
#anchbox3_Props {
background-color: #E1ECF5;
width: 280.819px;
height: 41.304px;
padding: 2px;
}
#anchbox4 {
position: absolute;
width: 252px;
height: 44px;
display: inline-block;
}
#anchbox5 {
position: absolute;
width: 289px;
height: 91px;
display: block;
page-break-inside: initial !important;
}
#box6 {
position: relative;
-webkit-transform: scale(1);
width: 289px;
height: 11px;
}
#box7 {
position: relative;
-webkit-transform: scale(1);
width: 41px;
height: 19px;
}
#box8 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 19px;
}
#box9 {
position: relative;
-webkit-transform: scale(1);
width: 41px;
height: 20px;
}
#box10 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 20px;
}
#box11 {
position: relative;
-webkit-transform: scale(1);
width: 41px;
height: 27px;
}
#box12 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 27px;
}
#box13 {
position: relative;
-webkit-transform: scale(1);
width: 41px;
height: 13px;
}
#box14 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 13px;
}
#anchbox24 {
position: absolute;
width: 289px;
height: 123px;
display: block;
page-break-inside: initial !important;
}
#box25 {
position: relative;
-webkit-transform: scale(1);
width: 289px;
height: 11px;
}
#box26 {
position: relative;
-webkit-transform: scale(1);
width: 40px;
height: 36px;
}
#box27 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 36px;
}
#box28 {
position: relative;
-webkit-transform: scale(1);
width: 40px;
height: 27px;
}
#box29 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 27px;
}
#box30 {
position: relative;
-webkit-transform: scale(1);
width: 40px;
height: 19px;
}
#box31 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 19px;
}
#box32 {
position: relative;
-webkit-transform: scale(1);
width: 40px;
height: 27px;
}
#box33 {
position: relative;
-webkit-transform: scale(1);
width: 248px;
height: 27px;
}
#box43 {
position: absolute;
left: 342.496px;
top: 104.882px;
width: 91px;
height: 44px;
}
#box44 {
position: absolute;
left: 27.85px;
top: 19.843px;
-webkit-transform: scale(1);
width: 312px;
height: 20px;
}
#box44_Props {
border-width: 0.2px;
border-style: solid;
border-color: #222021;
width: 308.411px;
height: 16.341px;
padding: 1.5px;
}
a.nostyle {
text-decoration: none;
color: inherit;
}
.page {
position: relative;
overflow: hidden;
width: 369px;
height: 595px;
}
</style>
</head>
<body style="margin:0%;">
<div class="page" id="section1">
<div id="box1">
<div id="box1_Props">
<!-- bg -->
</div>
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="char-Normal-Local-22" style="top:-18.73px;">173</span>
</div>
</div>
<div id="box2">
<div class="QxpTextBox">
<div style="position:absolute;left:42.52px;top:0.00px;width:6194px;height:220px;background-color:#005BAA;">
<!--
Rule -->
</div>
<span class="char-Titulo_0_principal-Local-80" style="left:42.52px;top:10.90px;">Herzinsuffizienz (HII)</span>
<span class="char-Titulo_0_principal-Local-80" style="left:5716.00px;top:10.90px;">[I50.9]</span>
<div style="position:absolute;left:0.00px;top:191.19px;width:6236px;height:20px;background-color:#005BAA;z-index:-1;">
<!--
Rule -->
</div>
<span class="para-NoStyle-127" style="left:623.62px;top:951.69px;">-</span>
<span class="para-NoStyle-127" style="left:793.70px;top:951.69px;">Inzidenz in der Schweiz: 20’000 neue Fälle/Jahr</span>
<span class="para-NoStyle-127" style="left:623.62px;top:1112.68px;">-</span>
<span class="para-NoStyle-127" style="left:793.70px;top:1112.68px;">Mortalität der akuten HI: > 50 %/12 Mt. (ohne Therapie der Grundkrankheit)</span>
<span class="para-NoStyle-127" style="left:453.54px;top:1273.68px;">•</span>
<span class="para-NoStyle-127" style="left:623.62px;top:1273.68px;">Der wichtigste Faktor, welcher zur Verschlimmerung der HI beiträgt ist die neurohumorale </span>
<span class="para-NoStyle-127" style="left:623.62px;top:1434.68px;">Aktivierung (d.h. Aktivierung des Sympathikus und des RAAS)!</span>
<div style="position:absolute;left:0.00px;top:2710.48px;width:6236px;height:200px;background-color:#646364;">
<!--
Rule -->
</div>
<span class="para-NoStyle-127" style="top:2750.93px;">Klas:</span>
<span class="para-NoStyle-176" style="left:510.24px;top:2750.93px;">«Zeitliche» Klassifizierung</span>
<span class="para-NoStyle-127" style="left:453.54px;top:2968.62px;">1.</span>
<span class="para-NoStyle-127" style="left:623.62px;top:2968.62px;">Akute Herzinsuffizienz</span>
<span class="para-NoStyle-127" style="left:453.54px;top:3129.62px;">2.</span>
<span class="para-NoStyle-127" style="left:623.62px;top:3129.62px;">Akute Dekompensation einer Herzinsuffizienz</span>
<span class="para-NoStyle-127" style="left:453.54px;top:3290.62px;">3.</span>
<span class="para-NoStyle-127" style="left:623.62px;top:3290.62px;">Chronische Herzinsuffizienz</span>
<span class="para-NoStyle-115" style="left:453.54px;top:7378.32px;">Tabelle: Klassifikation der Herzinsuffizienz nach der NYHA.</span>
<div style="position:absolute;left:283.46px;top:7635.24px;width:5953px;height:180px;background-color:#646364;">
<!--
Rule -->
</div>
<span class="para-NoStyle-176" style="left:510.24px;top:7655.68px;">ACC/AHA Klassifikation</span>
<span class="para-NoStyle-327" style="left:2099.73px;top:7661.29px;font-size:91px;"> </span>
<span class="para-NoStyle-195" style="left:2125.01px;top:7672.61px;">[</span>
<span class="para-NoStyle-195" style="left:2158.35px;top:7672.61px;">ACC/AHA Guidelines. JACC 2001;38:2101</span>
<span class="para-NoStyle-195" style="left:4432.96px;top:7672.61px;">]</span>
<span class="para-NoStyle-115" style="left:453.54px;top:10337.87px;">Tabelle: ACC/AHA Klassifikation der HI.</span>
</div>
<div style="position:absolute;left:23.68px;top:81.61px;">
<div id="anchbox3">
<div id="anchbox3_Props">
<!-- bg -->
</div>
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-9" style="top:-5.43px;">Für die PRAXIS</span>
<div style="position:absolute;left:0.00px;top:137.76px;width:1027px;height:20px;background-color:#005BAA;z-index:-1;">
<!--
Rule -->
</div>
<span class="para-NoStyle-18" style="top:183.91px;">Es ist essentiell, die URSACHE des Syndroms «Herzinsuffizienz» zu suchen!</span>
<span class="para-NoStyle-18" style="top:344.91px;">Die häufigsten Ursachen der HI in den westlichen Ländern sind:</span>
<span class="para-NoStyle-18" style="top:505.91px;">•</span>
<span class="para-NoStyle-18" style="left:170.08px;top:505.91px;">Koronare Herzkrankheit (KHK) → i.d.R. systolische Dysfunktion </span>
<span class="para-NoStyle-18" style="top:666.91px;">•</span>
<span class="para-NoStyle-18" style="left:170.08px;top:666.91px;">Arterielle Hypertonie (AHT) → i.d.R. diastolische Dysfunktion</span>
</div>
</div>
</div>
<div style="position:absolute;left:31.82px;top:174.41px;">
<div id="anchbox4">
<img alt="77.png" class="qxpASDImage" height="45" src="assets/77.png" width="253" />
</div>
</div>
<div style="position:absolute;left:22.68px;top:391.66px;">
<div id="anchbox24">
<table class="table1_1">
<colgroup>
<col style="width:14.07%;" />
<col style="width:85.93%;" />
</colgroup>
<tbody>
<tr>
<td class="td1_1" colspan="2">
<div id="box25">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-36" style="top:-7.62px;">ACC/AHA Klassifikation der Herzinsuffizienz</span>
</div>
</div>
</td>
</tr>
<tr>
<td class="td1_11">
<div id="box26">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-36" style="top:-16.61px;">Grad A</span>
</div>
</div>
</td>
<td class="td1_12">
<div id="box27">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-127" style="top:-16.61px;">•</span>
<span class="para-NoStyle-127" style="left:49.01px;top:-16.61px;"> </span>
<span class="para-NoStyle-127" style="left:170.08px;top:-16.61px;">Patienten mit hohem Risiko, eine HI zu entwickeln (z.B. art. Hypertonie, </span>
<span class="para-NoStyle-127" style="left:170.08px;top:144.39px;">KHK, Diabetes mellitus, Alkoholabusus, Kokainabusus u.a.).</span>
<span class="para-NoStyle-127" style="top:305.39px;">•</span>
<span class="para-NoStyle-127" style="left:49.01px;top:305.39px;"> </span>
<span class="para-NoStyle-127" style="left:170.08px;top:305.39px;">Keine strukturellen oder funktionellen Myokard-, Perikard- oder </span>
<span class="para-NoStyle-127" style="left:170.08px;top:466.39px;">Klappenabnormitäten. Keine Symptome.</span>
</div>
</div>
</td>
</tr>
<tr>
<td class="td1_6">
<div id="box28">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-36" style="top:-16.61px;">Grad B</span>
</div>
</div>
</td>
<td class="td1_7">
<div id="box29">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-127" style="top:-16.61px;">•</span>
<span class="para-NoStyle-127" style="left:49.01px;top:-16.61px;"> </span>
<span class="para-NoStyle-127" style="left:170.08px;top:-16.61px;">Patienten mit struktureller Herzkrankheit, welche aber keine Symptome </span>
<span class="para-NoStyle-127" style="left:170.08px;top:144.39px;">oder Befunde einer HI aufweisen (z.B. linksventrikuläre Hypertrophie oder </span>
<span class="para-NoStyle-127" style="left:170.08px;top:305.39px;">Dilatation, Status nach Myokardinfarkt u.a.).</span>
</div>
</div>
</td>
</tr>
<tr>
<td class="td1_2">
<div id="box30">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-36" style="top:-16.61px;">Grad C</span>
</div>
</div>
</td>
<td class="td1_3">
<div id="box31">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-127" style="top:-16.61px;">•</span>
<span class="para-NoStyle-127" style="left:49.01px;top:-16.61px;"> </span>
<span class="para-NoStyle-127" style="left:170.08px;top:-16.61px;">Patienten mit aktuellen oder vorgängigen HI-Symptomen, welche einer </span>
<span class="para-NoStyle-127" style="left:170.08px;top:144.39px;">strukturellen Herzkrankheit zuzuordnen sind.</span>
</div>
</div>
</td>
</tr>
<tr>
<td class="td1_6">
<div id="box32">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-36" style="top:-16.61px;">Grad D</span>
</div>
</div>
</td>
<td class="td1_7">
<div id="box33">
<div class="QxpTextBox" style="left:2px;top:2px;">
<span class="para-NoStyle-127" style="top:-16.61px;">•</span>
<span class="para-NoStyle-127" style="left:49.01px;top:-16.61px;"> </span>
<span class="para-NoStyle-127" style="left:170.08px;top:-16.61px;">Patienten mit fortgeschrittener struktureller Herzkrankheit und schweren </span>
<span class="para-NoStyle-127" style="left:4659.85px;top:-16.61px;"> </span>
<span class="para-NoStyle-127" style="left:170.08px;top:144.39px;">HI-Symptomen, trotz max. medikamentöser Therapie (inkl. Spezialisten-ein-</span>
<span class="para-NoStyle-127" style="left:170.08px;top:305.39px;">griffen).</span>
</div>
</div>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div id="box43">
<img alt="NORME_onglet.png" class="qxpASDImage" height="44" src="assets/NORME_onglet.png" width="91" />
</div>
<div id="box44">
<div id="box44_Props">
<!-- bg -->
</div>
<div class="QxpTextBox" style="left:1.7px;top:1.7px;">
<span class="para-NoStyle-127" style="top:-2.40px;">Der Update 2018 des Kapitels HERZINSUFFIZIENZ kann wie folgt eingesehen werden:</span>
<span class="char-Normal-Local-374" style="top:168.89px;"></span>
<span class="para-NoStyle-36" style="left:131.80px;top:164.61px;"> http://www.investimed.ch/HERZINSUFFIZIENZ_2018.pdf</span>
</div>
</div>
</div>
<script src="../assets/PressRunWidgets.js" type="text/javascript">
/* Script */
</script>
</body>
</html>
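One direction that might work for pages like the sample above is to compute an approximate page position for every text node from the `left`/`top` values and sort by it. The following is only a sketch under several assumptions, not a layout engine: the names `VisualOrder`, `visual_order`, and `id_offsets` are hypothetical; the factor 0.05 is copied from the `.QxpTextBox` rule; offsets that live in the stylesheet (for example `#box2`) must be passed in by hand; `.QxpVertTextBox` is not handled; and elements laid out by normal flow, such as table cells, are not resolved, so everything inside a `<table>` simply keeps its document order at the table's position.

```python
import re
from html.parser import HTMLParser

VOID = {"img", "br", "col", "meta", "link", "hr", "input"}
SKIP = {"script", "style", "title"}
TEXTBOX_SCALE = 0.05  # assumption: .QxpTextBox uses transform: scale(0.05, 0.05)


def px(style, prop):
    """Read a pixel length such as 'top:81.61px' from an inline style string."""
    match = re.search(r"(?:^|;)\s*" + prop + r"\s*:\s*(-?[\d.]+)px", style)
    return float(match.group(1)) if match else 0.0


class VisualOrder(HTMLParser):
    """Collect (y, x, text) items with approximate page coordinates."""

    def __init__(self, id_offsets=None):
        super().__init__()
        # (left, top) taken from the stylesheet, e.g. {"box2": (27.85, 43.243)}
        self.id_offsets = id_offsets or {}
        # One frame per open element: (tag, x, y, scale, frozen)
        self.stack = [("", 0.0, 0.0, 1.0, False)]
        self.items = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        _, ox, oy, scale, frozen = self.stack[-1]
        if frozen:  # inside a table: reuse the table's position
            x, y = ox, oy
        else:
            style = attrs.get("style") or ""
            inline = (px(style, "left"), px(style, "top"))
            dx, dy = self.id_offsets.get(attrs.get("id"), inline)
            x, y = ox + scale * dx, oy + scale * dy
        if tag == "img":
            self.items.append((y, x, "[img %s]" % attrs.get("src", "")))
        if tag in VOID:
            return
        if "QxpTextBox" in (attrs.get("class") or "").split():
            scale *= TEXTBOX_SCALE  # children are laid out in scaled units
        self.stack.append((tag, x, y, scale, frozen or tag == "table"))

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        tag, x, y, _, _ = self.stack[-1]
        text = data.strip()
        if text and tag not in SKIP:
            self.items.append((y, x, text))


def visual_order(html, id_offsets=None):
    """Return text items sorted top to bottom, then left to right."""
    parser = VisualOrder(id_offsets)
    parser.feed(html)
    parser.close()
    # sorted() is stable, so items with equal keys keep their document order.
    return [text for _, _, text in sorted(parser.items, key=lambda i: i[:2])]
```

For the sample page, `id_offsets` would be filled from the `#box1`, `#box2`, `#box43`, and `#box44` rules. Sorting strictly by `top` is fragile when spans on one visual line have slightly different `top` values, so rounding or grouping tops into lines is probably needed for real pages.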