The following snippet comes from "http://pdfx.cs.man.ac.uk/usage". It is a great tool that converts scientific papers in PDF format to XML.
curl --data-binary @"/path/to/my.pdf" \
-H "Content-Type: application/pdf" \
-L "http://pdfx.cs.man.ac.uk"
This is a Unix command line, and I want a PHP version of it. Here is what I tried:
$pdfFile = fopen('jucs_18_05_0623_0649_hasan.pdf', 'r');
$fileSize = filesize ('jucs_18_05_0623_0649_hasan.pdf');
$url="http://pdfx.cs.man.ac.uk";
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //
curl_setopt($ch, CURLOPT_TIMEOUT, 100);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_PUT, true);
curl_setopt($ch, CURLOPT_INFILE, $pdfFile);
curl_setopt($ch, CURLOPT_INFILESIZE, $fileSize);
curl_setopt($ch, CURLOPT_VERBOSE, true);
$fp = fopen("test.xml", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
if (! $res = curl_exec($ch))
echo "Error: ".curl_error($ch);
else {
echo "Success";
}
curl_close($ch);
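The problem is that test.xml ends up containing the HTML of the site's index page instead of the XML version of the article I submitted.
Looking forward to your expert advice.
Thanks in advance.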
Answer 0 (score: 1)
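CURLOPT_PUT is not needed; a Content-Length header is. Setting CURLOPT_PUT after CURLOPT_POST switches the request to an HTTP PUT, whereas the command-line example sends a POST.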
<?php
$pdfFile = fopen('1.pdf', 'r');
$fileSize = filesize ('1.pdf');
$url="http://pdfx.cs.man.ac.uk";
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //
curl_setopt($ch, CURLOPT_TIMEOUT, 100);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: application/pdf","Content-length: ".$fileSize));
curl_setopt($ch, CURLOPT_INFILE, $pdfFile);
curl_setopt($ch, CURLOPT_INFILESIZE, $fileSize);
curl_setopt($ch, CURLOPT_VERBOSE, true);
$fp = fopen("test.xml", "w");
curl_setopt($ch, CURLOPT_FILE, $fp);
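// With CURLOPT_FILE set, curl_exec() writes the response to $fp and returns true or false.
if (! $res = curl_exec($ch)) {
echo "Error: ".curl_error($ch);
} else {
echo "Success";
}
curl_close($ch);
// Close both file handles so test.xml is fully flushed to disk.
fclose($fp);
fclose($pdfFile);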
?>
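As a possible alternative, here is a minimal sketch that mirrors the `--data-binary` call more directly. It assumes the PDF is small enough to load into memory. When CURLOPT_POSTFIELDS is given a string, the string is sent unchanged as the request body and cURL computes the Content-Length itself. The function names `pdfx_curl_options` and `pdfx_convert` are hypothetical helpers made up for this example; the URL and the Content-Type header are the ones from the question.

```php
<?php
// Hypothetical helper: builds the cURL options for posting a raw PDF body.
function pdfx_curl_options(string $pdfPath): array
{
    $body = file_get_contents($pdfPath);
    if ($body === false) {
        throw new RuntimeException("Cannot read $pdfPath");
    }
    return [
        CURLOPT_URL            => 'http://pdfx.cs.man.ac.uk',
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body, // a string body is sent as-is, like --data-binary
        CURLOPT_HTTPHEADER     => ['Content-Type: application/pdf'],
        CURLOPT_FOLLOWLOCATION => true,  // same as -L
        CURLOPT_RETURNTRANSFER => true,  // make curl_exec() return the response body
        CURLOPT_TIMEOUT        => 100,
    ];
}

// Hypothetical helper: posts the PDF and saves whatever the server returns.
function pdfx_convert(string $pdfPath, string $xmlPath): void
{
    $ch = curl_init();
    curl_setopt_array($ch, pdfx_curl_options($pdfPath));
    $xml = curl_exec($ch);
    if ($xml === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    file_put_contents($xmlPath, $xml);
}
```

It could be called as `pdfx_convert('1.pdf', 'test.xml');`. For very large files the streaming approach with CURLOPT_INFILE shown above avoids holding the whole PDF in memory.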