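Reference: http://git.macropus.org/2011/11/pdftotext/example/
In this project, the developer takes a PDF as input and passes it to the variable `input`. I want to create an upload menu / dropzone so that anyone can upload their own PDF, have it passed to `input` automatically, and get the text extracted. I can upload the file, but I don't know how to pass that PDF to `input`.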
import tkinter

root = tkinter.Tk()
screensize = (root.winfo_screenwidth(), root.winfo_screenheight())
widget1 = tkinter.Entry(root, width=20)
widget2 = tkinter.Label(root, text='Hello')
## Now perform update_idletasks() which will force
## Tkinter to calculate their size.
widget1.update_idletasks()
widget2.update_idletasks()
## Now calculate whether they'll fit.
## winfo_reqwidth() is used because winfo_width() reports 1 until a widget is placed.
if widget1.winfo_reqwidth() + widget2.winfo_reqwidth() > screensize[0]:  ## Not enough space
    widget1.grid(row=0)
    widget2.grid(row=1)
else:  ## Both will fit on one line.
    widget1.grid(row=0, column=0)
    widget2.grid(row=0, column=1)
The form below uploads the PDF; what remains is passing the uploaded file to the variable `input`.
<body>
<form id="upload" method="post" action="upload.php" enctype="multipart/form-data">
<div id="drop">
Drop Here
<a>Browse</a>
<input id="inputx" src="./" type="file" name="upl" multiple />
</div>
<ul>
<!-- The file uploads will be shown here -->
</ul>
</form>
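Answer 0 (score: 1)
You just need to point Pdf.js at a copy of the file you uploaded.
In the example code, Pdf.js gets its data through an XMLHttpRequest, which takes the .pdf file name from the src attribute of the element whose ID is input: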
xhr.open('GET', input.getAttribute("src"), true);
If you set that element's src attribute to the file path of the PDF you uploaded to the server, the script should work as is.
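As a rough illustration of that step, the sketch below sets the src attribute from script instead of from PHP. The function name `setPdfSource` and the path `uploads/report.pdf` are hypothetical and not part of the original example; the only thing taken from the example is that the script reads the src attribute of the element with ID input.

```javascript
// Hypothetical helper: point the element that the script reads (id="input")
// at the uploaded PDF. `uploadedPath` is whatever path your upload script
// saved the file under; it is an assumption here, not a fixed value.
function setPdfSource(inputElement, uploadedPath) {
  // Reject anything that does not look like a PDF path before using it.
  if (!/\.pdf$/i.test(uploadedPath)) {
    throw new Error("Expected a path ending in .pdf, got: " + uploadedPath);
  }
  inputElement.setAttribute("src", uploadedPath);
  // Return the value the XMLHttpRequest would later read back.
  return inputElement.getAttribute("src");
}

// In the browser this might be called as:
//   setPdfSource(document.getElementById("input"), "uploads/report.pdf");
```

The extension check is only a convenience for catching obvious mistakes; it does not verify that the file really is a PDF.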
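Here is some code that may help. index.html is a simple file upload form that calls PHP to upload the file to the same directory that index.html is served from. file_upload.php saves the uploaded file and sets the value of the src attribute on the iframe with the following line:
<iframe id="input" src="<?php echo htmlspecialchars($uploadfile, ENT_QUOTES); ?>"></iframe>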
index.html:
<html>
<head>
<title>Converting PDF To Text using pdf.js</title>
</head>
<body>
<div>
<!-- the PDF file must be on the same domain as this page -->
<form enctype="multipart/form-data" action="file_upload.php" method="POST">
<input id="fileInput" type="file" name="userfile">
<input type="submit" value="Submit">
</form>
</div>
</body>
</html>
file_upload.php:
<?php
$uploadfile = basename($_FILES['userfile']['name']);
echo '<pre>';
if (move_uploaded_file($_FILES['userfile']['tmp_name'], $uploadfile)) {
echo "File is valid, and was successfully uploaded.\n";
} else {
echo "Possible file upload attack!\n";
}
echo 'Here is some more debugging info:';
print_r($_FILES);
print "</pre>";
?>
<html>
<head>
<title>Converting PDF To Text using pdf.js</title>
<style>
html, body { width: 100%; height: 100%; overflow-y: hidden; padding: 0; margin: 0; }
body { font: 13px Helvetica,sans-serif; }
body > div { width: 48%; height: 100%; overflow-y: auto; display: inline-block; vertical-align: top; }
iframe { border: none; width: 100%; height: 100%; }
#output { padding: 10px; box-shadow: 0 0 5px #777; border-radius: 5px; margin: 10px; }
#processor { height: 70px; }
</style>
</head>
<body>
<div>
<!-- embed the pdftotext web app as an iframe -->
<iframe id="processor" src="../"></iframe>
<!-- a container for the output -->
<div id="output"><div id="intro">Extracting text from a PDF file using only Javascript.<br>Tested in Chrome 16 and Firefox 9.</div></div>
</div>
<div>
<iframe id="input" src="<?php echo htmlspecialchars($uploadfile, ENT_QUOTES); ?>"></iframe>
</div>
<script>
var input = document.getElementById("input");
var processor = document.getElementById("processor");
var output = document.getElementById("output");
window.addEventListener("message", function(event){
if (event.source != processor.contentWindow) return;
switch (event.data){
case "ready":
var xhr = new XMLHttpRequest;
xhr.open('GET', input.getAttribute("src"), true);
xhr.responseType = "arraybuffer";
xhr.onload = function(event) {
processor.contentWindow.postMessage(this.response, "*");
};
xhr.send();
break;
default:
output.textContent = event.data.replace(/\s+/g, " ");
break;
}
}, true);
</script>
</body>
</html>
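To make the message exchange in the script above easier to follow, here is a minimal sketch of the same logic with the browser objects passed in as parameters. The names `createProcessorHandler`, `loadPdf` and `showText` are hypothetical and do not appear in the original example; the "ready" message, the ArrayBuffer reply and the whitespace cleanup are taken from the script above.

```javascript
// Sketch of the message protocol used above. Dependencies are injected so the
// logic can be exercised outside a page. All three parameter names are
// assumptions made for this illustration.
function createProcessorHandler(processorWindow, loadPdf, showText) {
  return function onMessage(event) {
    // Ignore messages from any window other than the pdftotext iframe.
    if (event.source !== processorWindow) return;
    if (event.data === "ready") {
      // The processor is ready: load the PDF bytes and hand them over.
      loadPdf(function (arrayBuffer) {
        processorWindow.postMessage(arrayBuffer, "*");
      });
    } else {
      // Anything else is extracted text; collapse runs of whitespace.
      showText(String(event.data).replace(/\s+/g, " "));
    }
  };
}

// In the browser this might be wired up as:
//   window.addEventListener("message", createProcessorHandler(
//     processor.contentWindow,
//     function (done) { /* XMLHttpRequest as above, then done(xhr.response) */ },
//     function (text) { output.textContent = text; }
//   ), true);
```

Passing `loadPdf` in as a callback means the source of the bytes can change (for example, a different file path after each upload) without touching the message handling.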